In artificial visuo-haptic object recognition systems, vision and haptics are usually modeled as separate processes. This, however, is far from what actually happens in the human brain, whose object recognition capabilities such systems aim to emulate: there, the two sensory modalities interact extensively through multimodal and crossmodal processes. Generally, three main principles can be identified as underlying the processing of visual and haptic object-related stimuli: 1. hierarchical processing, 2. the divergence of processing into substreams for object shape and material perception, and 3. the experience-driven self-organization of the integrative neural circuits. This raises the question, which we set out to answer in this thesis, of whether an object recognition system can improve its performance by following a more brain-inspired strategy for combining visual and haptic input. We compare the brain-inspired integration strategy with conventionally used integration strategies on data collected with a robot that was equipped with inexpensive contact microphones as tactile sensors. The results of our experiments with eleven everyday objects show that 1. the contact microphones constitute a viable, low-cost alternative for capturing tactile information and that 2. the brain-inspired integration strategy is more robust to unreliable inputs.
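
To make the contrast between integration strategies concrete, the following minimal sketch compares a conventional "flat" fusion, which concatenates visual and haptic features into one classifier, with a toy stand-in for a hierarchical, substream-based fusion loosely following the three principles above. This is not the thesis's implementation: the feature dimensions, the PCA-based "shape" and "material" substreams, the MLP classifiers, and the random data are illustrative assumptions only.

```python
# Toy comparison of a flat (concatenation) fusion vs. a hierarchical,
# substream-based fusion of visual and haptic features. All numbers and
# models are assumptions for illustration, not the thesis's setup.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_samples, n_classes = 220, 11            # eleven objects, as in the experiments
vis = rng.normal(size=(n_samples, 64))    # visual feature vectors (assumed dimension)
hap = rng.normal(size=(n_samples, 32))    # haptic/tactile feature vectors (assumed)
y = rng.integers(0, n_classes, size=n_samples)

# Conventional strategy: concatenate both modalities, train one classifier.
flat = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0)
flat_score = cross_val_score(flat, np.hstack([vis, hap]), y, cv=5).mean()

def substream_codes(X, dims=(16, 8)):
    """Project one modality onto two low-dimensional substreams, a toy
    stand-in for learned shape and material representations."""
    shape = PCA(n_components=dims[0]).fit_transform(X)
    material = PCA(n_components=dims[1]).fit_transform(X)
    return np.hstack([shape, material])

# Brain-inspired strategy (sketched): each modality is first reduced within
# its own hierarchy and split into substreams; only these codes are fused.
fused = np.hstack([substream_codes(vis), substream_codes(hap)])
hier = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0)
hier_score = cross_val_score(hier, fused, y, cv=5).mean()

print(f"flat fusion accuracy (toy data): {flat_score:.2f}")
print(f"substream fusion accuracy (toy data): {hier_score:.2f}")
```

On random data both scores hover around chance; the sketch only illustrates where the two strategies differ structurally, namely whether integration happens on raw concatenated features or on modality-specific, substream-organized codes.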