Visual-auditory integration

In a 2001 study, G. F. Meyer and S. M. Wuerger examined the nature of visual-auditory integration in the perception of motion events (for another study on visual-auditory integration, see the McGurk Effect). Earlier studies by other researchers had indicated a link between visual and auditory information in motion event processing, drawing on both neurophysiological and behavioral evidence. On the neurophysiological side, neurons in the superior colliculus, a midbrain structure that receives both visual and auditory input, were found to respond to stimulation from multiple modalities, and the receptive fields of these neurons were spatially aligned across modalities. On the behavioral side, studies cited by Meyer & Wuerger had shown cross-modal links in both endogenous and exogenous spatial attention for sequential and simultaneous stimuli in the auditory and visual modalities.

In the present study, Meyer & Wuerger set out to provide further evidence for these cross-modal links between the visual and auditory systems. More specifically, they wanted to answer two questions: 1) Can subjects be biased towards a particular interpretation of visual stimuli by auditory stimuli? 2) Can subjects' sensitivity to visual stimuli be affected by auditory stimuli?
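The distinction between these two questions maps onto the standard signal-detection distinction between response bias and sensitivity. As a hedged illustration (this is the generic signal-detection formulation, not necessarily the analysis Meyer & Wuerger themselves used), the two can be separated like this:

```python
from statistics import NormalDist

def dprime_and_criterion(hit_rate, false_alarm_rate):
    """Signal-detection measures: sensitivity (d') and response bias (c).

    d' captures how well a subject distinguishes signal from noise,
    while c captures their tendency to favor one response. An auditory
    cue could shift c (bias subjects towards one interpretation) without
    changing d' (their actual sensitivity), or vice versa.
    """
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    d_prime = z(hit_rate) - z(false_alarm_rate)
    criterion = -0.5 * (z(hit_rate) + z(false_alarm_rate))
    return d_prime, criterion
```

On this framing, question 1 asks whether audio shifts the criterion c, and question 2 asks whether audio changes d'.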

The Experiment

Visual Stimuli

The visual stimuli consisted of a random dot kinematogram (RDK): a black square on which 500 white dots moved from random start positions along linear trajectories. The researchers varied the coherence, the proportion of dots moving in the same direction; for example, an RDK on which all of the dots moved from right to left would have 100% coherence, while an RDK on which none of the dots moved in the same direction would have 0% coherence. For this experiment, subjects saw either an RDK with 0% coherence or one with a very low level of coherence. The 0% coherence RDK let the researchers test whether subjects could be biased by the auditory stimuli towards seeing a particular, non-existent pattern in the display. The RDKs with low coherence let the researchers measure the effects of auditory stimuli on subjects' sensitivity to the subtle but real motion patterns in the display.
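The coherence manipulation can be sketched in code. The sketch below illustrates the general RDK idea only; it is not the authors' stimulus code, and names like `signal_direction` and the per-frame speed are assumptions for illustration:

```python
import math
import random

def make_rdk_directions(n_dots=500, coherence=0.0, signal_direction=0.0, seed=0):
    """Assign a motion direction (in radians) to each dot in an RDK.

    A `coherence` fraction of the dots share `signal_direction`
    (e.g. 0.0 = rightward); the rest move in independent random
    directions. coherence=0.0 yields purely random motion.
    """
    rng = random.Random(seed)
    n_signal = round(n_dots * coherence)
    directions = [signal_direction] * n_signal
    directions += [rng.uniform(0.0, 2 * math.pi) for _ in range(n_dots - n_signal)]
    rng.shuffle(directions)
    return directions

def step(positions, directions, speed=1.0):
    """Advance each dot one frame along its linear trajectory."""
    return [(x + speed * math.cos(d), y + speed * math.sin(d))
            for (x, y), d in zip(positions, directions)]
```

With coherence 0.05, only 25 of the 500 dots carry the signal direction, which is why the pattern is subtle but real; at coherence 0.0 there is no signal at all to detect.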

Auditory Stimuli

The auditory stimuli consisted of white noise cross-faded between two speakers placed behind the visual stimuli and invisible to the subject. The researchers varied both the placement of the speakers and the speed at which the white noise was cross-faded from one speaker to the other. The conditions for this experiment were as follows:


A) Audio matches visual for both position and speed: the speakers are placed directly behind the RDK, and the speed of the cross-fade matches that of the dots' motion.

B and C) Audio matches visual for speed, but not for position: the speakers are placed to the left of the RDK (Condition B) or to the right of the RDK (Condition C), but the speed of the cross-fade matches that of the dots' motion.

D) Audio matches visual for position, but not for speed: the speakers are placed behind the RDK, but the cross-fade is faster than the dots' motion.
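The cross-fade manipulation can be sketched as a pair of time-varying speaker gains. This is a minimal sketch assuming a simple linear panning law; the study's actual panning curve is not described here:

```python
def crossfade_gains(t, duration):
    """Linear cross-fade: (left, right) speaker gains at time t.

    At t=0 all energy is in the left speaker; by t=duration it has
    moved entirely to the right, which listeners hear as left-to-right
    motion. A smaller `duration` makes the auditory motion faster, as
    in condition D, where the audio outpaces the dots' motion.
    """
    p = min(max(t / duration, 0.0), 1.0)  # progress, clipped to [0, 1]
    return 1.0 - p, p
```

A right-to-left sweep is just the same function with the gains swapped; the two speakers always sum to unit gain under this panning law.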


Meyer & Wuerger found that they could indeed introduce a bias towards one interpretation or another when the subjects viewed an RDK with 0% coherence: if the audio moved from the left speaker to the right speaker, subjects were more likely to report that the dots were moving left to right, and vice versa. This held true even in conditions B through D, when the auditory stimuli did not match the visual stimuli for position or for speed. However, when it came to increasing subjects' sensitivity to the subtle patterns in the RDKs with low coherence, Meyer & Wuerger found an effect only when the audio matched the visual for both position and speed.

Meyer & Wuerger conclude from these results that facilitating audio-visual integration effects are unlikely to exist for the perception of global object motion at moderate speeds, but that audio-visual integration may benefit other visual functions, such as the control of eye movements.

Infant Studies

Meyer & Wuerger's results can be related to a study by Hollich, Newman, & Jusczyk (2005) in which seven-and-a-half-month-old infants were able to segment words from a target auditory stream in the presence of a simultaneous distractor stream (a man's voice reading a different passage in a monotone) only when the target audio was synchronized with a simultaneous visual display. This held true both when the visual stimulus was the face of a woman reading various passages (the target audio) in infant-directed speech (IDS) and when it was an oscilloscope display matched to the target audio.

What we can extract from both of these studies is that the human perceptual system is inclined to integrate information from multiple modalities in order to form cohesive 'pictures' of the events around us, but that it tolerates only a certain degree of discrepancy before it stops matching information across modalities. Hollich et al. postulated that infants' ability to perform the task indicated that they use synchronized (see neural synchrony) visual and auditory information to figure out who is talking; combined with the Meyer & Wuerger results, this suggests a more general tendency, in both infants and adults, to determine whether information from multiple modalities can be interpreted as emanating from the same source, and to direct attention accordingly.
