There are at least three types of models out there: 1. auditory models, 2. motor models, and 3. sensory-motor models.
Here's my simplified cartoon of an auditory model:
This is closest to my view. The access route from sound input to the conceptual system does not flow through the motor system, although the motor system can modulate activity in the sensory system.
Here's a cartoon of a motor theory:
Something like this has been promoted by Liberman in the form of the Motor Theory of speech perception, as well as by Fadiga. One comment I'm getting a lot lately (including from Luciano) is that no one really believes in the motor theory. So here are a couple of quotes from Fadiga & Craighero (2006), Cortex, 42, 486-490:
According to Liberman’s theory … the listener understands the speaker when his/her articulatory gestures representations are activated by the listening to verbal sounds. p. 487
Liberman’s intuition … that the ultimate constituents of speech are not sounds but articulatory gestures that have evolved exclusively at the service of language, seems to us a good way to consider speech processing in the more general context of action recognition. p. 489
On this view, the route from acoustic speech input to the conceptual system flows through the motor system.
Here is my cartoon of a sensory-motor model:
This seems to be what Fadiga has in mind based on his comments on this blog, namely that it is the "matching" of the sensory and motor systems that is critical for recognition to happen.
As Brad Buchsbaum pointed out, both a motor theory and a sensory-motor theory predict that damage to the motor-speech system should produce substantial deficits in speech recognition. As this prediction doesn't hold up empirically, these theories in their strong forms are wrong.