COVID-19 Social Media Data Bundle

This page contains resources associated with multiple studies (completed and ongoing) focusing on COVID-19.

Research topics/studies:

We are actively conducting a variety of COVID-19-related studies in collaboration with researchers from Emory University (Medicine, Public Health, and Nursing), the Georgia Department of Public Health (GDPH), and the Centers for Disease Control and Prevention (CDC). A summary follows:

  • Syndromic surveillance: we are using real-time social media data with natural language processing and machine learning methods to identify and track symptom distributions over time.

  • Localized outbreak detection: we are combining retrospective data from social media with data from GDPH to train AI algorithms that can detect patterns in social media chatter indicative of localized outbreaks.

  • Toxicosurveillance: we are building methods to detect unapproved treatments that are promoted for COVID, in order to identify potential toxic exposures.

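  • MOUD treatment access during COVID: in line with our past work, we are studying how COVID is affecting treatment programs for substance use disorder (SUD) and opioid use disorder (OUD), including access to medications for opioid use disorder (MOUD).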

  • COVID and mental health: we are studying the impact of COVID on mental health using social media mining methods.
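
To make the syndromic surveillance idea above more concrete, here is a minimal sketch of lexicon-based symptom tagging that counts symptom mentions per day. It is an illustration only, not the method used in these studies: the mini-lexicon, the example posts, and the function names (compile_lexicon, symptom_counts_by_day) are all hypothetical, and the entries are not taken from the lexicon distributed on this page.

```python
import re
from collections import Counter, defaultdict

# Hypothetical mini-lexicon: standard symptom -> lay expressions.
# Illustrative entries only, not taken from the actual lexicon.
LEXICON = {
    "fever": ["fever", "high temperature"],
    "cough": ["cough", "coughing"],
    "anosmia": ["loss of smell", "can't smell"],
}

def compile_lexicon(lexicon):
    """Build one case-insensitive, word-bounded regex per symptom."""
    patterns = {}
    for symptom, expressions in lexicon.items():
        # Longest expressions first so "coughing" is tried before "cough".
        ordered = sorted(expressions, key=len, reverse=True)
        alternation = "|".join(re.escape(e) for e in ordered)
        patterns[symptom] = re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)
    return patterns

def symptom_counts_by_day(posts, patterns):
    """Count, per day, the number of posts mentioning each symptom."""
    counts = defaultdict(Counter)
    for day, text in posts:
        for symptom, pattern in patterns.items():
            if pattern.search(text):
                counts[day][symptom] += 1
    return counts

# Hypothetical posts as (date, text) pairs.
posts = [
    ("2020-04-01", "Day 3 of fever and I can't smell anything"),
    ("2020-04-01", "Still coughing, no fever though"),
    ("2020-04-02", "Loss of smell is the weirdest part"),
]

counts = symptom_counts_by_day(posts, compile_lexicon(LEXICON))
for day in sorted(counts):
    print(day, dict(counts[day]))
```

Note that this sketch does no negation handling, so "no fever though" is still counted as a fever mention; a realistic pipeline would need to address negation, misspellings, and non-self-reports.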


Sarker A, Lakamana S, Hogg-Bremer W, Xie A, Al-Garadi MA, Yang YC. Self-reported COVID-19 symptoms on Twitter: An analysis and a research resource. [in press]. Preprint


Twitter COVID-19 Symptom Lexicon

Expanded Symptom Lexicon (expanded automatically via a data-driven method) [Publication Forthcoming].

Last updated: May 28th, 2020.