Understand and Evaluate the Sampling Bias in Geotagged Social Media Data
Topics:
Keywords: sampling bias, representativeness, geotagged social media data, Twitter
Abstract Type: Paper Abstract
Authors:
Ruowei Liu, University of Georgia
Xiaobai Yao, University of Georgia
Abstract
Interest in geotagged social media has grown in recent decades. Twitter is one of the most popular sources of geotagged social media data and has been applied in a wide range of cross-disciplinary research. Despite the success of such research, the sampling bias of these data remains under-investigated. Sampling bias is a misrepresentation of the population that arises when some members of the population have a lower or higher probability of being sampled than others. Existing articles have discussed how to understand sampling bias in geotagged social media data. However, those discussions are often ad hoc, and most do not examine sampling bias within a theoretical framework that would support a more comprehensive understanding. In response, this research aims to make two contributions to this widely discussed and urgent research question. First, the research studies the nature of sampling bias in geotagged social media data and proposes a conceptual framework for it. Second, the research attempts not only to estimate sampling bias in geotagged social media data but also to analyze its spatial and temporal patterns. Social media has become one of the primary ways for people to express their opinions. Because sampling bias exists, voices from certain groups of people may not be heard. Therefore, the importance of understanding the sampling bias of geotagged social media data should not be ignored. Research on this topic can be expected to keep growing.