Scholars increasingly use Twitter data to study the life sciences and politics. However, Twitter data collection tools often pose challenges for scholars who are unfamiliar with their operation. Equally important, although many tools claim to offer representative samples of the full Twitter archive, little is known about whether those samples are in fact representative of the targeted population of tweets. This article evaluates such tools in terms of costs, training, and data quality as a means to introduce Twitter data as a research tool. Further, using an analysis of COVID-19 and moral foundations theory as an example, we compare the distributions of moral discussions obtained from two commonly used tools for accessing Twitter data (Twitter's standard APIs and third-party access) with the ground truth, the Twitter full archive. Our results highlight the importance of assessing the comparability of data sources to improve confidence in findings based on Twitter data. We also review the major new features of Twitter's API version 2.
Tools, costs, skill sets, and lessons learned," Politics and the Life Sciences 41(1), 114-130 (2 March 2023). https://doi.org/10.1017/pls.2021.19