Nov 2, 2023
Large Language Models Understand and Can Be Enhanced by Emotional Stimuli
A short explainer on EmotionPrompt and how emotional stimuli can affect LLM task performance.
Have you ever wondered whether LLMs have, or could have, emotions? A recent paper, "Large Language Models Understand and Can be Enhanced by Emotional Stimuli", explores how emotional prompts affect Large Language Models (LLMs), asking whether LLMs can genuinely understand emotional stimuli and perform better because of them. The paper introduces EmotionPrompt, a technique that boosts LLM performance by combining the original prompt with an emotional cue. The goal is to make LLMs not only linguistically adept but also responsive to emotional context.
Experiments & Findings
The authors ran automated experiments on 45 tasks using several LLMs: Flan-T5-Large, Vicuna, Llama 2, BLOOM, ChatGPT, and GPT-4. The tasks covered both deterministic and generative applications. In addition, a human study with 106 participants was carried out to assess the generative tasks.
EmotionPrompt showed significant improvements: a relative performance improvement of 8.00% on Instruction Induction and 115% on BIG-Bench, plus an average improvement of 10.9% in the human study across performance, truthfulness, and responsibility metrics.
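Since the headline numbers are relative improvements, it may help to see how such a figure is computed. The sketch below is only an illustration: the function name `relative_improvement` and the two scores are hypothetical and are not taken from the paper.

```python
def relative_improvement(baseline: float, with_emotion: float) -> float:
    """Return the relative gain over the baseline score, in percent."""
    if baseline == 0:
        raise ValueError("baseline score must be non-zero")
    return 100.0 * (with_emotion - baseline) / baseline


# Hypothetical scores, chosen only to illustrate the arithmetic.
vanilla_score = 20.0
emotion_score = 43.0
print(relative_improvement(vanilla_score, emotion_score))  # prints 115.0
```

Note that a large relative gain can correspond to a modest absolute gain when the baseline score is low.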
Prompt examples
The researchers added emotional weight to a prompt by appending a short emotional phrase to it. These phrases are called "stimuli". Stimuli include phrases like:
- This is very important to my career
- Are you sure?
- Longer phrases, such as: Are you sure that's your final answer? Believe in your abilities and strive for excellence. Your hard work will yield remarkable results.
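In practice, applying a stimulus can be as simple as appending it to the original prompt. The following is a minimal sketch of that idea, not the authors' implementation: the helper `add_stimulus`, the dictionary keys, and the example task prompt are all assumptions made for illustration.

```python
# Two stimuli from the list above; the dictionary keys are hypothetical labels.
STIMULI = {
    "career": "This is very important to my career.",
    "sure": "Are you sure?",
}


def add_stimulus(prompt: str, stimulus: str) -> str:
    """Append a single emotional stimulus to the end of the original prompt."""
    return f"{prompt.rstrip()} {stimulus.strip()}"


# Hypothetical task prompt used only for demonstration.
original = "Determine whether a movie review is positive or negative."
print(add_stimulus(original, STIMULI["career"]))
```

The resulting string is then sent to the model in place of the original prompt, so the technique requires no changes to the model itself.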
How to get the most out of it?
The authors presented a few significant findings:
- Higher temperature: EmotionPrompt is more effective at higher temperature settings and exhibits greater robustness compared to vanilla prompts.
- Combined stimuli: More emotional stimuli generally lead to better performance. However, there is no need to add multiple stimuli when a single stimulus already achieves good performance. Interestingly, combining stimuli drawn from different psychological theories can also boost performance.
It is also worth noting that different tasks require different emotional stimuli for optimal performance. Larger models, like Vicuna and Llama 2, tend to benefit more from EmotionPrompt, and (no surprise) pre-training strategies influence its effectiveness.
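To experiment with combined stimuli, one option is to append several phrases to the same prompt. The sketch below assumes simple space-separated concatenation and skips duplicates; the function name `combine_stimuli` and the example prompt are hypothetical, and the paper may combine stimuli differently.

```python
from typing import Iterable


def combine_stimuli(prompt: str, stimuli: Iterable[str]) -> str:
    """Append each distinct, non-empty stimulus to the prompt, in order."""
    parts = [prompt.strip()]
    seen = set()
    for stimulus in stimuli:
        cleaned = stimulus.strip()
        # Skip blanks and repeats so the prompt does not grow needlessly.
        if cleaned and cleaned not in seen:
            seen.add(cleaned)
            parts.append(cleaned)
    return " ".join(parts)


# Hypothetical task prompt; the stimuli are ones mentioned in this article.
combined = combine_stimuli(
    "Summarize the following paragraph in one sentence.",
    ["This is very important to my career.", "Are you sure?"],
)
print(combined)
```

Given the finding above, it seems reasonable to start with a single stimulus and only add more if the single-stimulus results are unsatisfactory.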
Conclusion
This research opens new doors for interdisciplinary studies combining LLMs and psychology. While the paper offers some insight into why EmotionPrompt works, a deeper understanding is still needed, especially from a data science perspective.
The authors are optimistic that further analysis will help explain the "magic" behind the emotional intelligence of LLMs.