
The Contextual Dynamics of Multimodal Emotion Recognition in Videos (multimedia) --- 用户6869393

Emotional expressions are a key component of user behavior on today's digital platforms. While multimodal emotion recognition techniques are attracting growing research attention, there is still a lack of deeper understanding of how visual and non-visual features can better recognize emotions in certain contexts but not in others. This study analyzes the interplay between the effects of multimodal emotion features derived from facial expressions, tone, and text, in conjunction with two key contextual factors: 1) the gender of the speaker, and 2) the duration of the emotional episode. Using a large dataset of more than 2,500 manually annotated YouTube videos, we found that although multimodal features consistently outperformed bimodal and unimodal features, their performance varied significantly across different emotions, genders, and durations. Multimodal features were found to perform particularly better for male than for female speakers in recognizing most emotions except fear. Furthermore, multimodal features performed particularly better for shorter than for longer videos in recognizing neutral, happiness, and surprise, but not sadness, anger, disgust, and fear. These findings offer new insights toward developing more context-aware emotion recognition and empathetic systems.
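The multimodal-versus-unimodal comparison above can be illustrated with a minimal late-fusion sketch. This is not the paper's actual model: the probability values, the `EMOTIONS` list ordering, and the averaging scheme are all illustrative assumptions, showing only how per-modality predictions (face, tone, text) might be combined into one multimodal decision.

```python
import numpy as np

# Seven emotion classes discussed in the abstract.
EMOTIONS = ["neutral", "happiness", "surprise", "sadness",
            "anger", "disgust", "fear"]

# Hypothetical per-modality class probabilities for one video clip;
# the numbers are made up for illustration only.
face_probs = np.array([0.10, 0.55, 0.15, 0.05, 0.05, 0.05, 0.05])
tone_probs = np.array([0.20, 0.40, 0.20, 0.05, 0.05, 0.05, 0.05])
text_probs = np.array([0.15, 0.50, 0.10, 0.10, 0.05, 0.05, 0.05])

def late_fusion(*modality_probs):
    """Average the per-modality probability vectors and renormalize."""
    fused = np.mean(modality_probs, axis=0)
    return fused / fused.sum()

fused = late_fusion(face_probs, tone_probs, text_probs)
print(EMOTIONS[int(np.argmax(fused))])  # -> happiness
```

Simple averaging (soft voting) is just one fusion strategy; studies like this one typically also compare bimodal subsets (e.g. face + tone) by fusing only the corresponding vectors.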

Original title: The Contextual Dynamics of Multimodal Emotion Recognition in Videos

Original abstract: Emotional expressions form a key part of user behavior on today’s digital platforms. While multimodal emotion recognition techniques are gaining research attention, there is a lack of deeper understanding on how visual and non-visual features can be used in better recognizing emotions for certain contexts, but not others. This study analyzes the interplay between the effects of multimodal emotion features derived from facial expressions, tone and text in conjunction with two key contextual factors: 1) the gender of the speaker, and 2) the duration of the emotional episode. Using a large dataset of more than 2,500 manually annotated videos from YouTube, we found that while multimodal features consistently outperformed bimodal and unimodal features, their performances varied significantly for different emotions, gender and duration contexts. Multimodal features were found to perform particularly better for male than female speakers in recognizing most emotions except for fear. Furthermore, multimodal features performed particularly better for shorter than for longer videos in recognizing neutral, happiness, and surprise, but not sadness, anger, disgust and fear. These findings offer new insights towards the development of more context-aware emotion recognition and empathetic systems.

Original authors: Prasanta Bhattacharya, Raj Kumar Gupta, Yinping Yang

Original link: https://arxiv.org/abs/2004.13274

