For the last few years I have been thinking a lot about a few different things: What specifically is our proposed dorsal stream doing? How does the motor system contribute to speech perception? What is the relation between sensorimotor processes used during speech production (e.g., feedback-based motor control models) and purported sensorimotor processes in speech perception? How do computational models of speech production (e.g., feedback control models, psycholinguistic models, neurolinguistic models) relate to neural models of speech processing? A new "Perspective" article, which just appeared today in Neuron and is currently free to download, summarizes the outcome of my thoughts on these questions, mixed with a heavy dose of input from my co-authors John Houde (UCSF) and Feng Rong (Talking Brains West postdoc). Yes, I'm very proud that the piece has been labeled a "perspective" rather than a "review" -- that means it is theoretically novel rather than a summary statement ;-)
The starting point for the article is the observation that there are two main lines of research on sensorimotor integration which, paradoxically, do not interact: the idea that the auditory system is critically involved in speech production (exemplified by motor control folks like Frank Guenther and John Houde) and the idea that the motor system is critically involved in speech perception (exemplified by folks like Stephen Wilson, Pulvermuller, and many others). We wondered whether these two lines of work could be integrated into one model. The answer, we propose, is yes.
The basic idea is that the dorsal stream sensorimotor integration circuit is built to support speech production via a state feedback control architecture of the sort that is common in the visual-manual motor control literature. But the computational properties of the system, particularly the generation of forward predictions of the sensory consequences of motor commands, provide a ready-made mechanism for the motor system to modulate (not drive) the perception of others' speech under some circumstances (e.g., when the acoustic signal is weak or ambiguous).
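For readers who think in code, here is a toy sketch of the state-feedback-control idea: control is computed from an internal state estimate, a forward model predicts the sensory consequences of each motor command from an efference copy, and the prediction error against actual feedback is used to correct the estimate. All names, gains, and the one-dimensional "articulatory state" are my own illustrative assumptions, not quantities from the article.

```python
# Toy sketch of a state feedback control loop for speech production.
# Illustrative only: the variable names, gains, and 1-D dynamics are
# assumptions for this sketch, not taken from Hickok, Houde & Rong (2011).

def simulate(target=1.0, steps=50, control_gain=0.5,
             plant_gain=1.2, correction_gain=0.8):
    """Drive a 1-D articulatory state toward an auditory target using
    an internal estimate that is corrected by sensory prediction errors."""
    state = 0.0     # actual state of the vocal tract (the "plant")
    estimate = 0.0  # the controller's internal estimate of that state
    for _ in range(steps):
        # Control is computed from the internal ESTIMATE, not directly
        # from slow, delayed sensory feedback.
        command = control_gain * (target - estimate)
        # Forward model: predict the sensory consequence of the command
        # from an efference copy (here it assumes a plant gain of 1).
        predicted = estimate + command
        # The real plant responds more strongly than the forward model
        # assumes, so the predictions are systematically off.
        state = state + plant_gain * command
        # Sensory prediction error: actual feedback minus prediction.
        error = state - predicted
        # Use the error to correct the internal state estimate.
        estimate = predicted + correction_gain * error
    return state, estimate

final_state, final_estimate = simulate()
print(f"state={final_state:.4f} estimate={final_estimate:.4f}")
```

Even though the forward model is miscalibrated here, production still converges on the target because feedback continually corrects the internal estimate. The point of the sketch is that the prediction machinery (the `predicted` line) exists for production; the article's claim is that this same machinery is what the motor system could lend to perception.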
In addition, we attempted to show how psycholinguistic models of speech production (e.g., Levelt, Dell) as well as neurolinguistic models (e.g., the concept of input and output phonological lexicons) relate to the proposed state feedback control model. I never liked the idea of there being two phonological "lexicons," but it actually makes a lot of sense within the framework of a state feedback control architecture.
The model also does a decent job of accounting for some of the key symptoms of conduction aphasia and stuttering, which emerge as different types of disruption of the same feedback control mechanism.
The graphic depiction of the model is below. I'm looking forward to your feedback on this!
Hickok, G., Houde, J., & Rong, F. (2011). Sensorimotor integration in speech processing: Computational basis and neural organization. Neuron, 69(3), 407-422. doi:10.1016/j.neuron.2011.01.019