Health-Related Social Media Text Classification
Prompts
The paper uses two prompts. Placeholders in square brackets (e.g., [X], [text], [tweet]) are replaced with task-specific wording at runtime.
Zero-Shot Classifier / Annotator Prompt
You are a [X] system based on raw tweet data. The system should analyze the provided tweet and predict whether the user is [X] or not. Given a tweet as input, the system should output a 1 if the user is [X], and 0 otherwise. If a text response is generated, reanalyze the input until a 1 or 0 is generated.
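The paper does not publish code for applying this template, but the mechanics are simple: substitute the [X] slots and parse the reply down to a binary label, retrying when the model returns free text. The sketch below assumes the first [X] names the task (e.g., "depression detection") and the later [X] slots name the condition (e.g., "depressed"); `build_prompt` and `parse_label` are hypothetical helper names, and the actual API call is left out.

```python
import re
from typing import Optional

# The paper's prompt with its [X] slots exposed as format fields.
# Assumption: the first slot takes a task phrase, the others a condition word.
TEMPLATE = (
    "You are a {task} system based on raw tweet data. The system should "
    "analyze the provided tweet and predict whether the user is {label} or "
    "not. Given a tweet as input, the system should output a 1 if the user "
    "is {label}, and 0 otherwise. If a text response is generated, reanalyze "
    "the input until a 1 or 0 is generated."
)

def build_prompt(task: str, label: str) -> str:
    """Fill the [X] placeholders for one classification task."""
    return TEMPLATE.format(task=task, label=label)

def parse_label(reply: str) -> Optional[int]:
    """Extract the binary label; None signals the retry the prompt asks for."""
    match = re.fullmatch(r"\s*([01])\s*", reply)
    return int(match.group(1)) if match else None
```

In a real pipeline, a `None` from `parse_label` would trigger re-sending the same tweet, mirroring the prompt's "reanalyze the input until a 1 or 0 is generated" instruction.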
Data Augmentation Prompt
Used to generate additional training examples. [text] is the original post and [tweet] is the placeholder for each generated post:
Write 5 tweets close to the tweet [text]. The output should follow this format: tweet 1:[tweet] tweet 2:[tweet] tweet 3:[tweet] tweet 4:[tweet] tweet 5:[tweet]
Usage Notes
This prompt is from the paper “Evaluating Large Language Models for Health-Related Text Classification Tasks with Public Social Media Data” (Guo et al., 2024).
- Tasks evaluated: Six health-related classification tasks on social media data across multiple datasets.
- Three LLM strategies compared: (1) zero-shot classifier, (2) LLM as data annotator for training supervised models, (3) LLM-generated data augmentation for fine-tuning.
- Key finding: GPT-4 zero-shot classifiers outperformed SVMs in 5 out of 6 tasks; data augmentation with GPT-4 improved RoBERTa model performance.
- Models: GPT-3.5-turbo and GPT-4.