Last Update: August 27, 2019

According to the World Health Organization, toxicovigilance is the active process of identifying and evaluating the toxic risks existing in a community, and evaluating the measures taken to reduce or eliminate them. Our toxicovigilance research focuses on prescription and illicit drug use/misuse and drug use disorder.

Currently, our work is primarily funded by the National Institute on Drug Abuse (NIDA) of the National Institutes of Health (NIH). This project primarily focuses on characterizing prescription drug misuse/abuse/nonmedical use by mining social media big data. We are (i) building close to real-time monitoring systems so that we can forecast potential future crises, (ii) developing methods to characterize prescription drugs based on their reported abuse/misuse, (iii) studying potential long-term impacts of drug use disorder and the natural history of addiction, and (iv) empowering toxicologists with information mined from social media so that they can take the necessary steps to help people suffering from opioid use disorder.

NIH-specific information about the project can be found HERE. We also received a small amount of funding (for annotation) and data from the PA CURE project.


Our work was featured in Popular Science (with some fair criticism and skepticism).
Our work was featured on The Emory Health Science Blog.
Our paper in JAMA Network Open shows that publicly available Twitter data, geospatial information, temporal information, natural language processing and applied machine learning can potentially be combined to predict the status of the opioid crisis at a specific place (county and substate) within the U.S.A.
Our MedInfo-2019 paper discusses effective data collection strategies for opioids from Twitter using NLP methods to generate common misspellings and supervised machine learning for filtering out noise.
Our JAMIA paper reviews the literature on social media mining for prescription medication use/misuse. We propose a simple data-centric framework that is suitable for social media data. A particularly important aspect of the framework is filtering out irrelevant information using supervised classification.
Our JMIR paper provides a detailed description of the importance of thorough annotation guidelines. The paper also contains publicly available data, the annotation guideline and other resources.
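
To illustrate the idea of generating common misspellings for data collection, here is a minimal sketch in Python. It is not the method from the MedInfo-2019 paper: it assumes a plain edit-distance-one generator, and the function names (edit_distance_one_variants, observed_misspellings) and the example vocabulary are hypothetical. Candidates are kept only if they actually occur in a vocabulary collected from posts, which keeps the keyword list from filling up with strings nobody types.

```python
import string


def edit_distance_one_variants(term):
    """Return every string one edit away from `term` (hypothetical helper).

    Edits considered: single-character deletion, adjacent transposition,
    substitution, and insertion, using lowercase ASCII letters only.
    """
    letters = string.ascii_lowercase
    splits = [(term[:i], term[i:]) for i in range(len(term) + 1)]
    deletes = {left + right[1:] for left, right in splits if right}
    transposes = {
        left + right[1] + right[0] + right[2:]
        for left, right in splits
        if len(right) > 1
    }
    replaces = {
        left + c + right[1:] for left, right in splits if right for c in letters
    }
    inserts = {left + c + right for left, right in splits for c in letters}
    variants = deletes | transposes | replaces | inserts
    variants.discard(term)  # the correctly spelled term is not a misspelling
    return variants


def observed_misspellings(term, corpus_vocabulary):
    """Keep only the generated variants that appear in the corpus vocabulary."""
    return sorted(edit_distance_one_variants(term) & set(corpus_vocabulary))


# Assumed toy vocabulary; a real one would come from collected posts.
vocabulary = ["oxycodone", "oxycodon", "oxycodne", "oxycotin", "percocet"]
candidates = observed_misspellings("oxycodone", vocabulary)
```

In this toy example, "oxycotin" is not returned because it is more than one edit away from "oxycodone". Posts retrieved with such keywords would still need the supervised noise filtering described above.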

NIH Abstract

The problem of prescription medication (PM) abuse has reached epidemic proportions in the United States. According to a 2014 report by the Director of the National Institute on Drug Abuse (NIDA), an estimated 52 million people have been involved in the non-medical use of PMs, a significant portion of which can be classified as abuse. PMs that are commonly abused include opioids, central nervous system depressants and stimulants, and the consequences of their abuse may be severe. Increases in PM misuse and abuse over the last 15 years have resulted in increased emergency department visits, rates of addiction and overdose deaths. Due to the rapidly escalating morbidity and mortality, the problem is now receiving national attention. The opioid crisis, which has its root in opioid-based PM abuse, has been declared a national emergency by the president of the United States. Despite the problems associated with PM abuse, surveillance programs such as prescription drug monitoring programs (PDMPs) are inadequate and suffer from numerous shortcomings, thus limiting their usefulness in real life. Studies evaluating the long-term effects of distinct classes of PMs on cohorts of abusers are scarce and expensive to conduct. To better characterize the problem and to monitor it in real time, new sources of information need to be identified and novel monitoring techniques need to be developed. To address these problems, our project aims to utilize social media data for performing toxicovigilance. Social media encapsulates an abundance of knowledge about PM abuse and the abusers in the form of noisy natural language text. At the heart of the proposed approach is a machine learning system that can automatically distinguish between abuse-indicating and non-abuse-indicating user posts collected from social media. Using this classification system, users will be categorized into multiple groups: (i) abusers, (ii) medical users and (iii) non-users.
The developed system will collect longitudinal data for users exposed to the selected PMs via periodic collection of their publicly available posts/discussions and automatically categorize them based on age, gender and additional demographic features, when possible. This will enable observational studies on targeted cohorts, involving hundreds of thousands of cohort members. The cohort studies will focus on analyzing the transition rates from medical use to abuse for distinct PMs and transition rates from abuse of PMs to illicit analogs. Implementation of this data-centric framework, which will be open source, will revolutionize the mechanism by which PM abuse monitoring is performed and enable the future development of intervention strategies targeted towards specific cohorts, at the most effective time periods.
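
As a rough illustration of what a transition rate could look like computationally, the sketch below assumes each user has a chronological list of category labels produced by a classifier. The label names ("medical_use", "abuse", "non_use"), the function transition_rate and the toy data are all hypothetical, and the definition used here (the share of users with a source label who later show the target label) is only one of several reasonable choices, not the project's stated method.

```python
def transition_rate(user_histories, source, target):
    """Share of users who show `target` after their first `source` label.

    `user_histories` maps a user id to a chronologically ordered list of
    category labels. Users who never show `source` are not counted.
    Returns None when no user shows `source`.
    """
    exposed = 0
    transitioned = 0
    for labels in user_histories.values():
        if source not in labels:
            continue
        exposed += 1
        first_source = labels.index(source)
        # Only labels that come after the first source label count.
        if target in labels[first_source + 1:]:
            transitioned += 1
    return transitioned / exposed if exposed else None


# Assumed toy data, one label per collection period.
histories = {
    "u1": ["medical_use", "medical_use", "abuse"],  # transitions
    "u2": ["medical_use"],                          # stays in medical use
    "u3": ["abuse", "medical_use"],                 # abuse came first
    "u4": ["non_use"],                              # never exposed
}
rate = transition_rate(histories, "medical_use", "abuse")
```

Here three users show medical use and one of them later shows abuse, so the rate is one third. A real cohort analysis would also have to account for follow-up time and classifier error, which this sketch ignores.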

Public Health Relevance Statement

Prescription Medication (PM) abuse is a major epidemic in the United States, and monitoring and studying the characteristics of the PM abuse problem requires the development of novel approaches. Social media encapsulates an abundance of data about PM abuse from different demographics, but extracting that data and converting it to knowledge requires advanced natural language processing and data-centric artificial intelligence systems. Our proposed social media mining framework will automate the process of big data to knowledge conversion for PM abuse, providing crucial insights to toxicologists about targeted populations and enabling the future development of directed intervention strategies.

Funding and Disclosures

  • National Institute on Drug Abuse (NIDA) of the National Institutes of Health (NIH)
  • Pennsylvania Department of Health

Disclosure: The published contents are solely the responsibility of the authors of the publications.