A quick note:
"Mashups showing the geographic location of the authors of social media content are popular. They generally depend on the authors reporting their own location. For blogs, automated geolocation strategies using IP address and domain name are not adequate for determining an author’s location. Instead, we detail textual geolocation techniques suitable for tagging social media data, facilitating development of geographic mashups and spatial reasoning tools."
As noted in this paper, trying to geolocate blogs from IP addresses or domain names does not work well, because many blogs map to hosting services rather than to the author's own location. To find the named entities that may refer to a location, the authors use GeoNames (a really great, free resource for anyone working on geographical analysis). The location of the blogger is then disambiguated by clustering occurrences of different geographical named entities in a region. From the paper:
For example, if “Maryland” has already been disambiguated as the U.S. state of Maryland, then the text “Laurel, Maryland” leads us to disambiguate “Laurel” as Laurel, Maryland.
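The containment heuristic in that example can be sketched roughly as follows. This is only a minimal illustration, not the paper's implementation: the tiny `GAZETTEER` dictionary is a hypothetical stand-in for a GeoNames lookup, and the function name `disambiguate` is made up for this sketch.

```python
# Hypothetical toy gazetteer standing in for a GeoNames lookup: each place
# name maps to candidate (name, containing region) pairs.
GAZETTEER = {
    "Laurel": [
        ("Laurel", "Maryland"),
        ("Laurel", "Mississippi"),
        ("Laurel", "Montana"),
    ],
    "Maryland": [("Maryland", "United States")],
}


def disambiguate(name, resolved_regions):
    """Return the candidates for `name` that lie in an already-resolved region.

    If no candidate matches a resolved region, all candidates are returned
    unchanged, i.e. the name stays ambiguous.
    """
    candidates = GAZETTEER.get(name, [])
    matches = [c for c in candidates if c[1] in resolved_regions]
    return matches or candidates


# "Maryland" has already been resolved, so "Laurel" narrows to Laurel, Maryland.
print(disambiguate("Laurel", {"Maryland"}))  # [('Laurel', 'Maryland')]
```

Without any resolved region, the same call returns all three candidates, which is where clustering over the other place names in the blog would have to do the work.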
Location is an important dimension in social media analysis: any results (for example, sentiment analysis or trends) that can be pivoted on it are especially useful for understanding what people in different regions think about a particular topic.
In the context of this work, there are a few more related papers (in addition to the references in this paper) one might want to check out. Although these papers don't address the same issue, they are interesting from the perspective of location analysis:
- A recent study of "Iran's online public" by Harvard researchers.
- "Spatial variation in search queries" by Backstrom et al.
- A study of blog comments of Persian bloggers by Noor Ali Hasan.
- In our work with Twitter analysis, we did some basic mapping of location names mentioned in user profiles to lat/long coordinates using the Yahoo API.