MIN Faculty
Department of Informatics
Knowledge Technology


This webpage is outdated and has moved to the official Knowledge Technology page.


CROSS: Cross-modal Learning

CROSS is a start-up project funded by the Landesforschungsförderung (state research funding).

Coordinators: Prof. Dr. Stefan Wermter, Prof. Dr. Jianwei Zhang.
Collaborators: Prof. Dr. Brigitte Röder, Prof. Dr. Andreas K. Engel.


CROSS aims to prepare and initiate research between the life sciences (neuroscience, psychology) and computer science in Hamburg and Beijing. The goal is to set up a collaborative research centre at the intersection of artificial intelligence, neuroscience and psychology, focused on the topic of cross-modal learning. Our long-term challenge is to understand the neural, cognitive and computational mechanisms of cross-modal learning and to use this understanding for (1) better analyzing human performance and (2) building effective cross-modal computational systems.


Project objectives

The long-term goal of our research is to understand the neural, cognitive and computational mechanisms of cross-modal learning and to use this understanding for (1) enhancing human performance and (2) building artificial cross-modal systems with behavior and performance similar to that of animals and humans. The term "cross-modal learning" refers to the fusion of complementary information from multiple sensory modalities such that learning within any individual sensory modality can be combined with or enhanced by information from one or more other modalities. Cross-modal learning is crucial for human understanding of the world and for acting effectively in a complex world. Examples are ubiquitous and include learning to grasp and manipulate objects, learning to walk, learning to read and write, and learning to understand language and its referents. In all these examples, visual, auditory, somatosensory or other modalities must be integrated, and learning must be cross-modal. In fact, a broad range of acquired human skills is cross-modal, and many of the most advanced human capabilities, such as those involved in social cognition, require learning from the richest combinations of cross-modal information.

This project pursues four key objectives that specify the goals of the preparatory work needed to establish the planned Hamburg-Beijing collaborative research centre.


Related Publications

Bauer, J., Dávila-Chacón, J., Wermter, S. Modeling development of natural multi-sensory integration using neural self-organisation and probabilistic population codes. Connection Science, pp. 1-19, Taylor & Francis, London, October, 2014.


Heinrich, S., Wermter, S. Interactive Language Understanding with Multiple Timescale Recurrent Neural Networks. In Wermter, S., et al., editors, Proceedings of the 24th International Conference on Artificial Neural Networks (ICANN 2014), pp. 193-200, Springer Heidelberg. Hamburg, DE, September 2014.


Kerzel, M., Habel, C. Event Recognition during the Exploration of Line-Based Graphics in Virtual Haptic Environments. In Spatial Information Theory, pp. 109-128, Springer International Publishing, 2013.