Emotion Recognition Capabilities of Large Language Models: A Comparative Analysis


Diatlinko E.S. (ISP RAS, Moscow, Russia; MSU, Moscow, Russia)
Pavlov M.D. (ISP RAS, Moscow, Russia)
Tigranyan S.T. (RAU, Yerevan, Armenia)
Avetisyan A.A. (ISP RAS, Moscow, Russia)

Abstract

Large language models (LLMs) are increasingly integrated into conversational systems, where understanding emotional cues is essential for maintaining coherent, engaging, and safe interactions. This study evaluates how effectively modern instruction-tuned LLMs can recognize emotions from text alone, without task-specific fine-tuning. We benchmark multiple open-weight LLM families (<15B parameters) across four prompting strategies (Baseline, Context, Few-shot, and Context+Few-shot) on two English emotion recognition in conversation (ERC) benchmarks, IEMOCAP and MELD, and one Russian dataset, RESD. We find that the optimal prompting strategy is dataset-dependent: semantically redundant data such as IEMOCAP benefits most from few-shot demonstrations (best weighted F1-score (WF1) of 73.3% with Context+Few-shot), whereas MELD gains primarily from incorporating dialogue history (best WF1 of 60.3% with Context). Robustness experiments show that LLMs are largely insensitive to the reordering of few-shot examples, but performance degrades substantially when the label space is corrupted, indicating that a coherent label space matters more than the order of the examples or their ground-truth labels. Cross-lingual evaluation reveals a notable drop on the Russian RESD dataset (best WF1 of 45.8%), highlighting a persistent gap between English and Russian affect understanding in current LLMs. Overall, non-fine-tuned LLMs serve as strong prompt-only baselines for ERC, yet remain clearly behind specialized supervised systems.
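
For readers unfamiliar with the metric, the weighted F1-score reported above is the average of per-class F1 scores, with each class weighted by its share of the ground-truth examples. The following is a minimal illustrative sketch in plain Python, not the authors' evaluation code; the function name weighted_f1 and the emotion labels in the usage lines are assumptions made for the example.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted average of per-class F1 scores (illustrative sketch)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one example is required")

    support = Counter(y_true)  # number of ground-truth examples per class
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        # Classes that never occur in y_true have zero support and zero weight.
        score += (n / total) * f1
    return score


# Hypothetical labels, chosen only to exercise the function.
gold = ["happy", "happy", "sad", "angry"]
pred = ["happy", "sad", "sad", "angry"]
print(round(weighted_f1(gold, pred), 4))  # 0.75
```

Because each class is weighted by its support, frequent emotion classes dominate the score, which is one reason the metric is commonly used on class-imbalanced emotion datasets.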

Keywords

large language models; emotion recognition; robustness; few-shot learning; emotion understanding.

Edition

Proceedings of the Institute for System Programming, vol. 38, issue 3, part 4, 2026, pp. 157-174

ISSN 2220-6426 (Online), ISSN 2079-8156 (Print).

DOI: 10.15514/ISPRAS-2026-38(3)-53

For citation

Diatlinko E.S., Pavlov M.D., Tigranyan S.T., Avetisyan A.A. Emotion Recognition Capabilities of Large Language Models: A Comparative Analysis. Proceedings of the Institute for System Programming, vol. 38, issue 3, part 4, 2026, pp. 157-174. DOI: 10.15514/ISPRAS-2026-38(3)-53.
