The ability to perceive, understand, and respond to social interactions in a human-like manner is one of the most desired capabilities in artificial agents, particularly social robots. These skills are highly complex and draw on several distinct areas of research, including affective understanding. An agent that can recognize, understand, and, most importantly, adapt to different human affective behaviors can extend its own social capabilities, interacting and communicating in a natural way.
The perception and categorization of emotional expressions are extremely popular topics in the affective computing community. However, most research in this field does not consider the inclusion of emotions in an agent's decision-making process. Treating emotion expression recognition as the final goal, although necessary, limits the usability of such solutions in more complex scenarios. To create a general affective model that can serve as a modulator for learning different cognitive tasks, such as modeling intrinsic motivation, creativity, dialog processing, grounded learning, and human-level communication, emotion perception alone cannot be the pivotal focus. Perception must be integrated with intrinsic concepts of emotional understanding, such as a dynamic and evolving mood and an affective memory, to model the necessary complexity of an interaction and to realize adaptability in an agent's social behavior.
Such models are most necessary for the development of real-world social systems that communicate and interact with humans in a natural way on a day-to-day basis. This could become the next goal for research on Human-Robot Interaction (HRI) and an essential part of the next generation of social robots.
To address this challenge, we designed, collected, and annotated a novel corpus of human-human interaction. This corpus builds on the experience we gathered while organizing the OMG-Emotion Recognition Challenge, making use of state-of-the-art frameworks for data collection and annotation.
The One-Minute Gradual Empathy dataset (OMG-Empathy) contains multi-modal recordings of pairs of individuals discussing predefined topics. One of them, the actor, shares a story about themselves while the other, the listener, reacts to it emotionally. We annotated each interaction based on the listener's own assessment of how they felt while the interaction was taking place.