Results 2018
Results of the 2018 OMG-Empathy Prediction Challenge
Date | Team | Submission | Description | Repository | Paper | Personalized Track | Generalized Track | Modality |
---|---|---|---|---|---|---|---|---|
12.2018 | Alpha-City | Manual | - | Link | Link | 0.17 | 0.17 | Audio+Images+Text |
12.2018 | Alpha-City | Filters | - | Link | Link | 0.12 | 0.12 | Audio+Images+Text |
12.2018 | Alpha-City | KNN | - | Link | Link | 0.03 | 0.03 | Audio+Images+Text |
12.2018 | EIHW | Submission1 | Generalized model trained with multimodal data using a BLSTM with 40 cell units. | Link | Link | - | 0.05 | Audio+Images |
12.2018 | EIHW | Filters | Generalized model trained with multimodal data using a BLSTM with 50 cell units. | Link | Link | - | 0.06 | Audio+Images |
12.2018 | EIHW | KNN | Personalized models trained with multimodal data using a BLSTM with 50 cell units. | Link | Link | - | 0.11 | Audio+Images |
12.2018 | USTC-AC | Result1 | The first result is predicted by our model trained on stories 2 and 8. | Link | Link | 0.14 | 0.14 | Audio+Images+Time |
12.2018 | USTC-AC | Result2 | The second and third results are predicted by our model trained on stories 1, 2, and 8. | Link | Link | 0.11 | 0.11 | Audio+Images+Time |
12.2018 | USTC-AC | Result3 | The second and third results are predicted by our model trained on stories 1, 2, and 8. | Link | Link | 0.13 | 0.13 | Audio+Images+Time |
12.2018 | A*STAR AI | G1_Predictions_AT | G1: Audio+Text multimodal LSTM with local attention. Submission G1 is a multimodal LSTM model using audio and text modalities, with local attention applied to the past 3 seconds. Audio features were extracted with OpenSMILE, and text features were GloVe word embeddings averaged over 1-second chunks. It was our best-performing model when Story 1 was used as the validation set, achieving a CCC value of 0.29. | Link | Link | - | 0.14 | Audio+Images+Text |
12.2018 | A*STAR AI | G2_Predictions_T1 | G2: Text LSTM. Submission G2 is a text-only LSTM model. It was our best-performing model on average under leave-one-out cross-validation, with an average CCC value of 0.133. We chose cross-validation to see which combination of features would be most robust when predicting different stories. The submitted predictions come from a text-only model trained only on the train set, which achieved a CCC value of 0.183 on the validation set. | Link | Link | - | 0.11 | Audio+Images+Text |
12.2018 | A*STAR AI | G3_Predictions_ATV | G3: Audio+Text+Visual multimodal LSTM with local attention. Submission G3 is a multimodal LSTM model using audio, text, and visual modalities, with local attention applied to the past 3 seconds. Audio features were extracted with OpenSMILE, text features were GloVe word embeddings averaged over 1-second chunks, and visual features were VGG facial features extracted for each subject (but not the actor). Among our multimodal models, it had the best cross-validated CCC score of 0.109. On the original validation set (Story 1), it had a CCC score of 0.228. | Link | Link | - | 0.07 | Audio+Images+Text |
12.2018 | A*STAR AI | P1_Predictions_AT | P1: Audio+Text multimodal LSTM, fine-tuned for each subject. Submission P1 contains predictions from a set of audio+text multimodal LSTM models fine-tuned from model G1. For each subject, we fine-tuned model G1 by training for 250 epochs on that subject's videos, using early stopping to select the epoch with the highest CCC score. Averaged across subjects, the personalized CCC score on the validation set (Story 1) is 0.323. | Link | Link | 0.14 | - | Audio+Images+Text |
12.2018 | A*STAR AI | P2_Predictions_T1 | P2: Text LSTM, fine-tuned for each subject. Submission P2 contains predictions from a set of text LSTM models fine-tuned from model G2. For each subject, we fine-tuned model G2 by training for 200 epochs on that subject's videos. Averaged across subjects, the personalized CCC score on the validation set (Story 1) is 0.211. | Link | Link | 0.07 | - | Audio+Images+Text |
12.2018 | A*STAR AI | P3_Predictions_ATV | P3: Audio+Text+Visual multimodal LSTM, fine-tuned for each subject. Submission P3 contains predictions from a set of audio+text+visual LSTM models fine-tuned from model G3. For each subject, we fine-tuned model G3 by training for 250 epochs on that subject's videos, using early stopping to select the epoch with the highest CCC score. Averaged across subjects, the personalized CCC score on the validation set (Story 1) is 0.284. | Link | Link | 0.07 | - | Audio+Images+Text |
12.2018 | USF Affective Vision | Submission1 | - | Link | Link | 0.00 | 0.00 | - |
12.2018 | Affective Bulls | CNN_RF_Fusion | - | Link | Link | - | 0.03 | Audio+Images |
12.2018 | Affective Bulls | RF_Land_Sub_Act | - | Link | Link | - | 0.04 | Audio+Images |
12.2018 | Affective Bulls | SubjectActorImages | - | Link | Link | - | -0.03 | Audio+Images |
12.2018 | Affective Bulls | SubjectImages | - | Link | Link | 0.02 | - | Audio+Images |
12.2018 | Baseline | Baseline | Barros, P., Barakova, E., & Wermter, S. (2018). A Deep Neural Model of Emotion Appraisal. arXiv preprint arXiv:1808.00252. Trained on the OMG-Emotion Recognition dataset. | Link | Link | 0.06 | 0.06 | Audio+Images |
12.2018 | Rosie | SVM | For submission 1, valence values are predicted mostly by SVMs trained on two features (one visual, one semantic). | Link | Link | 0.08 | 0.08 | Audio+Images+Semantic |
12.2018 | Rosie | NeuralNet | For submission 2, we used neural networks trained on five features (verbal and non-verbal) to predict the valence values. | Link | Link | 0.07 | 0.07 | Audio+Images+Semantic |
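
The Personalized Track and Generalized Track columns above report the Concordance Correlation Coefficient (CCC) between the predicted and the self-annotated valence traces, which is also the metric the teams quote in their submission descriptions. For readers who want to score their own prediction files the same way, here is a minimal NumPy sketch of CCC (population statistics; the function name and the toy traces are illustrative, not taken from any team's code):

```python
import numpy as np

def ccc(gold: np.ndarray, pred: np.ndarray) -> float:
    """Concordance Correlation Coefficient between two 1-D valence traces."""
    gold = np.asarray(gold, dtype=float)
    pred = np.asarray(pred, dtype=float)
    mean_g, mean_p = gold.mean(), pred.mean()
    var_g, var_p = gold.var(), pred.var()               # population variances
    cov = ((gold - mean_g) * (pred - mean_p)).mean()    # population covariance
    return 2.0 * cov / (var_g + var_p + (mean_g - mean_p) ** 2)

# Example: a prediction that tracks the gold trace but is scaled and shifted
t = np.linspace(0.0, 10.0, 500)
gold = 0.5 * np.sin(t)                  # gold valence in [-0.5, 0.5]
pred = 0.3 * np.sin(t) + 0.1            # attenuated, offset prediction
print(f"CCC = {ccc(gold, pred):.3f}")   # below 1.0: scale and offset errors are penalized
```

Unlike plain Pearson correlation, CCC also penalizes differences in scale and offset, which is why the toy prediction above scores below 1.0 even though it is perfectly correlated with the gold trace.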
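
Several personalized-track entries (for example A*STAR AI's P1-P3) describe the same recipe: take a generalized model and fine-tune it separately on each subject's videos for a fixed number of epochs, keeping the epoch with the best validation CCC. The sketch below illustrates that loop in PyTorch under stated assumptions: `model`, `train_loader`, `val_features`, `val_gold`, and the reuse of the `ccc` function above are illustrative names, not the teams' actual code.

```python
import copy
import torch

def finetune_for_subject(model, train_loader, val_features, val_gold,
                         epochs=250, lr=1e-4):
    """Fine-tune a copy of a generalized valence model on one subject's videos,
    early-stopping on the validation CCC (see the `ccc` sketch above)."""
    model = copy.deepcopy(model)        # each subject starts from the same generalized weights
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = torch.nn.MSELoss()
    best_ccc, best_state = float("-inf"), copy.deepcopy(model.state_dict())

    for _ in range(epochs):
        model.train()
        for features, valence in train_loader:           # this subject's videos only
            optimizer.zero_grad()
            loss = loss_fn(model(features).squeeze(-1), valence)
            loss.backward()
            optimizer.step()

        model.eval()
        with torch.no_grad():                            # keep the best-CCC epoch
            pred = model(val_features).squeeze(-1)
            score = ccc(val_gold.numpy(), pred.numpy())
        if score > best_ccc:
            best_ccc, best_state = score, copy.deepcopy(model.state_dict())

    model.load_state_dict(best_state)
    return model, best_ccc
```

Deep-copying the generalized model means every subject starts from the same weights, mirroring how the P1-P3 predictions were derived from G1-G3.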