Conrad Hackett points to this interesting paper by A. Baronchelli, B. Gonçalves and colleagues. If you are interested in social networks, big data, spatial analysis, etc., you should take a look at their work.
Paper: The Twitter of Babel: Mapping World Languages through Microblogging Platforms. PLoS One.
Abstract:
[...] we survey worldwide linguistic indicators and trends through the analysis of a large-scale dataset of microblogging posts. We show that available data allow for the study of language geography at scales ranging from country-level aggregation to specific city neighborhoods. The high resolution and coverage of the data allows us to investigate different indicators such as the linguistic homogeneity of different countries, the touristic seasonal patterns within countries and the geographical distribution of different languages in multilingual regions. This work highlights the potential of geolocalized studies of open data sources to improve current analysis and develop indicators for major social phenomena in specific communities.
Twitter users per capita
[image credit: Mocanu et al. 2013]
Multiscale view of the geolocated Twitter signal