Sarker Lab

Advancing AI in Medicine through Natural Language Processing and Data Science

Research Objectives:

- Build portable, customizable and interpretable systems for medical free text processing.

- Develop end-to-end solutions for medical and public health problems of high significance.

- Create and share text mining, machine learning, natural language processing and artificial intelligence solutions.

Focus Areas

We strive to design and build natural language processing and machine learning frameworks that are portable across medical and public health problems. We pay particular attention to ensuring that technological innovations in data science, NLP, machine learning and artificial intelligence meet the specific needs of the medical domain. These needs include, but are not limited to, interpretability, simplicity, reliability and timeliness. The following are some of our current focus areas.


We are building data-centric methods for monitoring, understanding and confronting the substance use epidemic.

Portable NLP

We are continuously developing BioNLP software that is portable across medical domain problems and does not live and die with narrow-scope studies. We implement methods for text classification, information detection and extraction, text representation and normalization, topic analysis and visualization.

Social Media Mining

Social networks contain abundant information on almost every topic. Social media adoption is at an all-time high, and the number of users continues to grow. We are innovating strategies for curating and utilizing social media data for medical and public health tasks, and we are continuously exploring new uses for these data.

Patient-centered NLP

We are striving to answer the question: What is patient-centeredness from the perspective of NLP? Our patient-centered NLP research focuses on enabling NLP-driven, evidence-based medical practice.

NLP for Cancer

We are developing social media-based NLP pipelines to help fight cancer. We are specifically interested in studying patient-reported outcomes (PROs), so that we can better understand the outcomes that matter to cancer patients.