A reasonable response to data such as these is to acknowledge that speech perception can happen with the auditory system alone. With that as our limiting case, if you want to explore the role of the motor system in speech perception, it will have to be a much more nuanced contribution, e.g., that the motor system somehow contributes a little bit under some circumstances. I've acknowledged this possibility. From Hickok et al. 2009:
the claim for the ‘necessity’ of the motor system in speech perception seems to boil down to 10 percentage points worth of performance on the ability to discriminate or judge identity of acoustically degraded, out of context, meaningless syllables – tasks that are not used in typical speech processing and that double-dissociate from more ecologically valid measures of auditory comprehension even when contextual cues have been controlled. This suggests a very minor modulatory role indeed for the motor system in speech perception.
Ok, so that's a little snarky for an acknowledgement. Here's another that's more measured from Hickok et al. 2011:
we propose that sensorimotor integration exists to support speech production, that is, the capacity to learn how to articulate the sounds of one’s language, keep motor control processes tuned, and support online error detection and correction. This is achieved, we suggest, via a state feedback control mechanism. Once in place, the computational properties of the system afford the ability to modulate perceptual processes somewhat, and it is this aspect of the system that recent studies of motor involvement in perception have tapped into.
I'm not sure I agree with myself any more, as the evidence for a modulatory role under ecologically valid listening conditions is extremely weak. For example, Pulvermuller and colleagues took the complaints about task choice seriously and performed a TMS study using comprehension as their measure. This study failed to replicate the accuracy effect found with discrimination or identification tasks, but it did find an RT effect that held for some sounds and not others. See my detailed comments on this study here.
But back to SDL and chinchillas. What is their take on these facts? Here's what they say:
Though these categorical speech perception studies are often revered because they suggest the reality of speech units like phonemes, they have been criticized. Problems include that the tasks assume the units under study and that within category differences are actually readily discernible and meaningful
I agree on both points: these studies don't necessarily imply that the phoneme (or segment) is a unit of analysis in perception, nor that listeners can't hear within-category differences (see Massaro's critiques of categorical perception). But that doesn't make the similarity between the human and chinchilla curves evaporate. No matter what unit is being analyzed or whether within-category differences can be detected under other task conditions, it still remains that chinchillas can hear subtle differences between speech sounds. SDL's critique is tangential.
SDL then turn to a line of argumentation that makes no sense to me. They write of the claim from animal work that
Neurobiologically, the argument is unsound because, in the early work frequently used to support this argument... the brain was not directly observed.
The claim is not neurobiological. It is functional. Neurobiology doesn't matter for the structure of the argument: if an animal cannot produce speech yet can perceive it, it follows that you don't need to be able to produce speech to perceive it. Period. But let's read on:
Yet it has been suggested that premotor cortex is involved in processing sounds that we cannot produce in ways that make use of the underlying computational mechanisms that would also be involved in movement
So this implies that motor plans for non-speech actions are sufficient for perceiving speech. Assuming that SDL buy into the broader claim that action understanding for, say, grasping and speech is achieved via motor simulation, what they are actually saying is that when a chinchilla perceives a speech sound, it resonates with a motor network for some non-speech action (biting?), and this somehow results in the correct perception of the speech sound (for which there is no motor plan) instead of the motor plan that it actually resonated with. Hmm. Isn't it a bit more parsimonious to assume that the two sounds are acoustically different and that the chinchilla's auditory system can detect and represent that difference?
If we are going to accept a hypothesis that deviates substantially from parsimony, we're going to need some very strong evidence. SDL highlight the fact that premotor areas of nonhuman primates activate during the perception of sounds they cannot produce. But again there is a more parsimonious explanation. The brain needs to map all sorts of perceptual events onto action plans for responding to those events. If you see a snake coiling and rattling its tail, you need to map that percept onto movement plans for getting out of the way. Presumably, your premotor cortex would be activated by that percept even though you have no motor programs for coiling and tail rattling. The same mechanism can explain the data SDL mention.
SDL also highlight that "Bruderer et al. (2015) showed that perturbing the articulators of 6-month old infants disrupts their ability to perceive speech sounds." But the study is confounded by differences in the amount of distraction that the methods of perturbation likely cause. Here are the teethers they used. Which one would you guess is more annoying to the infant? Once you've made your guess, go read the paper and see which one caused the speech perception decline.
In sum, SDL make a convoluted argument to salvage the idea that the motor system is responsible for the perception of speech even in animals and prelingual infants. A much simpler explanation exists: auditory speech perception is achieved by the auditory system, which is present in adult humans, prelingual infants, and chinchillas, all of which can perceive speech surprisingly well.