Are sight and sound out of sync?
New research suggests speech comprehension improves with a slight auditory delay in some individuals
A new paper has shed light on sensory timing in the brain by showing that there is a lag between us hearing a person’s voice and seeing their lips move, and these delays vary depending on what task we’re doing.
The research from City, University of London – co-authored by academics from University of Sussex, Middlesex University and Birkbeck, University of London – also showed potential benefits of tailoring the time delay between audio and video signals to each individual, finding that speech comprehension improved in 50 per cent of participants by 20 words in every 100, on average. The paper is published in the journal Scientific Reports.
Dr Elliot Freeman, author of the paper and Senior Lecturer in Psychology at City, University of London, said: “Our study sheds new light on a controversy spanning over two centuries, over whether senses can really be out of sync in some individuals. Speech comprehension was at its best in our study when there was a slight auditory lag, which suggests that perception can actually be less-than-optimal when we converse or watch TV with synchronised sound and vision. We think that by tailoring auditory delays to each individual we could improve understanding and correct their sub-optimal perception.”
To investigate the effect, the team showed 36 participants audiovisual movies depicting the lower half of a face speaking words or syllables, with the audio degraded by background noise. Stimuli were presented with a range of audiovisual asynchronies, spanning nine equally spaced levels from 500ms auditory lead to 500ms auditory lag, including simultaneous. In two separate tasks, participants had to identify which phoneme (‘ba’ or ‘da’), or which word the speaker said.
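As a rough illustration of that design, nine equally spaced levels between a 500ms auditory lead and a 500ms auditory lag work out to steps of 125ms. The short Python sketch below computes them; the function name, its parameters and the sign convention (negative for auditory lead, positive for auditory lag) are assumptions made for illustration and are not taken from the paper.

```python
# Hypothetical helper: the name, parameters and sign convention are
# illustrative assumptions, not taken from the published study.
def asynchrony_levels(max_offset_ms=500, n_levels=9):
    """Return equally spaced audio offsets in milliseconds.

    Negative values mean the audio leads the video (auditory lead);
    positive values mean the audio trails the video (auditory lag).
    """
    step = 2 * max_offset_ms / (n_levels - 1)  # 1000 / 8 = 125 ms
    return [-max_offset_ms + i * step for i in range(n_levels)]


levels = asynchrony_levels()
print(levels)
# [-500.0, -375.0, -250.0, -125.0, 0.0, 125.0, 250.0, 375.0, 500.0]
```

The middle value, 0.0, corresponds to the simultaneous condition mentioned above.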
The team found that there was an average auditory lag of 91ms for the phoneme identification task (based on the famous McGurk effect) and 113ms for word identification, but this varied widely between individuals.
Interestingly, the measurements from each task for individual participants were similar for repetitions of the same task, but not for different tasks. This suggests that we may have not just one general individual delay between seeing and hearing, but different ones for different tasks.
The authors speculate that these task-specific delays arise because neural signals from eyes and ears are integrated in different brain areas for different tasks, and that they must each travel via different routes to arrive there. Individual brains may differ in the length of these connections.
Dr Freeman said: “What we found is that sight and sound are in fact out of sync, and that each time delay is unique and stable for each individual. This can have a real impact on the interpretation of audiovisual speech.
“Poor lip-sync, as often experienced on cable TV and video phone calls, makes it harder to understand what people are saying. Our research could help us better understand why our natural lip-syncing may not always be perfect, and how this may affect everyday communication. We might be able to improve communication by artificially correcting sensory delays between sight and sound. Our work might be translated into novel diagnostics and therapeutic applications benefiting individuals with dyslexia, autism spectrum or hearing impairment.”