This is a nice figure comparing the population pyramids of London residents and of Twitter users in London. The figure comes from a new publication by Paul Longley and colleagues (UCL). It is interesting how much one chart can tell about the traps that come with Big Data, such as gender and age bias. Check Tim Harford's talk on the Big Data Trap.
Read more about the paper below.
[image credit: Longley et al, 2015]
Longley P A, Adnan M, Lansley G, 2015, "The geotemporal demographics of Twitter usage" Environment and Planning A 47(2) 465 – 484.
Presentation, ungated version of the paper
This paper presents a preliminary empirical evaluation of the strategic importance of infusing Twitter social media data into classifications of small areas, as a way of moving beyond the nighttime residential geographies of conventional geodemographic classifications. We attempt an empirically based critique of the merits and drawbacks of the use of social media data, in which the value of high spatial and temporal granularity of revealed activity patterns is contrasted with the paucity of individual attribute information. We apply new and novel methods to enrich the profiles of Twitter users in order to generalize about activity patterns in London, our case-study city. More insidious problems in the use of social media data arise from the as-yet-unknown sources and operation of bias in their user bases. Our contribution is to begin to identify and assess the biases inherent in social media usage in social research, and use these to evaluate their deployment in research applications.