How researchers studied COVID-19 on Twitter

  By Nathalia    16 March 2021

A year after the coronavirus upended the world as we knew it, billions of conversations related to the pandemic have taken place on Twitter. From conversations that connected people to valuable information and resources to people coming together to share their experiences, Twitter has become one of the largest repositories of public data for understanding the context, perceptions, and evolution of discussions around COVID-19.

To enable research on this conversation, last year Twitter built a COVID-19 stream endpoint and opened applications for it, helping researchers and developers access data for studies that could support the public good. While new applications for the COVID-19 stream have since closed to make room for the new academic research product track, a number of people around the world continue using this data to make an impact. Below, we’re spotlighting a few stories of how researchers are using Twitter data to study the public conversation around COVID-19.

Who’s using the COVID-19 stream

Over 100 research and developer teams were granted access to the COVID-19 stream after a review process. All applications were manually reviewed for four things. First, does the application demonstrate familiarity with the Twitter API and the computational resources required to consume a high volume of unstructured data in real time? Second, does the project require this level of data access, and is it otherwise impossible to accomplish with the standard v1.1 API? Third, does the applicant understand the sensitivity of this data and have a clear plan for handling it safely and in compliance with our Developer Policy? Finally, is the applicant planning to use this data to benefit the public good?

Together, those granted access represent 30 different countries, spanning nearly every continent. The majority were using this data for academic research, collectively representing 92 different academic institutions and universities around the world. About 8% of approved uses were for non-academic organizations or independent developers and researchers, who shared similar goals around using this data for good, like building dashboards, apps, tools, and resources free for the public. For example, Clarabridge leveraged this data for their Social Pulse on COVID-19, a part of their information center built to assist people in the customer experience industry and the public.

What are they studying?

Here are just a few examples of what we’ve seen so far.

More than half of those approved for this stream are focused on studying disinformation and misinformation around the facts of coronavirus.

  • Researchers from the University of Washington Center for an Informed Public explored what drove viral misinformation about COVID-19, including how influential people politicized scientific facts.
  • Researchers from Northeastern, Harvard, Northwestern, and Rutgers used this data to examine how misinformation enters the social media ecosystem, how far it spreads, and the types of Twitter accounts that spread it. Their study of over 30 million Tweets found that 80 to 90 percent of “fake news” comes from a few tenths of one percent of all accounts sharing information about the virus. In previous studies, they’ve also explored the relationship between groups likely to share misinformation and groups likely to believe it, noting that more research is needed to understand whether belief in the information predicts sharing.
  • Researchers in the Department of Computer Science at the University of Southern California explored how to identify unreliable or misleading content, how this information spreads, which trends are emerging in misleading content about COVID-19 (see their research publication), and how to identify coordinated disinformation campaigns (see their research publication).
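
A concentration figure like the one quoted above (a small fraction of accounts producing most of the flagged content) can be computed from simple per-account counts. The sketch below is only an illustration of that kind of calculation, not the researchers' method: the function name `top_share`, the toy data, and the idea of having one account id per flagged Tweet are all assumptions made for this example.

```python
from collections import Counter


def top_share(account_ids, top_fraction):
    """Fraction of all items posted by the most active `top_fraction` of accounts.

    `account_ids` holds one (hypothetical) account id per flagged Tweet.
    """
    counts = Counter(account_ids)
    if not counts:
        return 0.0
    # Always include at least one account in the "top" group.
    n_top = max(1, round(len(counts) * top_fraction))
    ranked = sorted(counts.values(), reverse=True)
    return sum(ranked[:n_top]) / sum(ranked)


# Hypothetical toy data: account "a" posted 8 of the 10 flagged Tweets.
flagged = ["a"] * 8 + ["b", "c"]
share = top_share(flagged, 0.34)  # top third of three accounts = one account
print(f"Top accounts produced {share:.0%} of flagged Tweets")
```

On real data, the labeling of which Tweets count as misinformation is the hard part; this calculation only summarizes how concentrated the sharing is once those labels exist.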

In most other cases, developers and researchers used this stream to understand public perceptions, sentiment, and the evolution of people’s attitudes about the pandemic over time.

  • Dr. Manlio De Domenico, Head of the CoMuNe Lab at the Bruno Kessler Foundation, has used this data to create the COVID-19 Infodemic Observatory. The observatory analyzes geolocated Tweets, aggregated at the country level, to estimate the fraction of automated posts (e.g., from bots) in the public discussion, the average sentiment of Tweets, and the volume of reliable sources of information. The work seeks to quantify the ‘infodemic risk’ of a particular location and was recently published in the journal Nature Human Behaviour.
  • Researchers at Penn Medicine also used this data to create an in-depth regional map of COVID-19 attitudes and perceptions in the US. The dashboard is intended to inform public policy and health communications. Check out the case study.
  • More recently, we have seen researchers shift from studying the virus itself (such as reported symptoms) to studying topics like vaccinations, public safety measures, and economic recovery.
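
The country-level aggregation described for the Infodemic Observatory above can be pictured as a simple group-and-average step over Tweets that have already been labeled. The following is a minimal sketch under stated assumptions, not the observatory's actual pipeline: the `LabeledTweet` type, its fields, and the `summarize_by_country` function are hypothetical, and the bot flags and sentiment scores are assumed to come from upstream classifiers not shown here.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class LabeledTweet:
    country: str        # country code derived from geolocation (assumed)
    is_automated: bool  # hypothetical output of a bot classifier
    sentiment: float    # hypothetical sentiment score in [-1, 1]


def summarize_by_country(tweets):
    """Group Tweets by country and compute per-country summary statistics."""
    groups = defaultdict(list)
    for tweet in tweets:
        groups[tweet.country].append(tweet)

    summary = {}
    for country, items in groups.items():
        n = len(items)
        summary[country] = {
            "tweets": n,
            "automated_fraction": sum(t.is_automated for t in items) / n,
            "mean_sentiment": sum(t.sentiment for t in items) / n,
        }
    return summary


# Hypothetical toy data for illustration only.
sample = [
    LabeledTweet("IT", True, 0.5),
    LabeledTweet("IT", False, -0.5),
    LabeledTweet("US", False, 0.25),
]
for country, stats in summarize_by_country(sample).items():
    print(country, stats)
```

Aggregating at the country level, rather than reporting on individual accounts, also fits the emphasis on handling this data sensitively.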

The future of COVID-19 study with Twitter data

The study of coronavirus and its adjacent topics will continue for quite some time. We’ve observed that at the beginning of the pandemic, much of the work was focused on symptoms, perceptions of the virus, and credibility of new information. Today, much of that conversation has shifted to the societal impacts that this pandemic has had, and perceptions of vaccinations. In all these cases, the Twitter Developer Platform continues to support developers and researchers who want to use it to improve the future.

If you are actively working on a research study related to COVID-19, or plan to work on something related in the near future, be sure to explore our product solutions for Academic Research. If you want to share what you’re doing or ask questions about the process, check out our forum.