Tuesday, March 24, 2009

D'Ausilio et al.'s response regarding the role of motor cortex in speech perception

Let's go through D'Ausilio et al.'s response to my commentary point by point and see if there is anything valid.

Dr. Hickok thinks "we are perfectly capable of perceiving tog when it is presented in its acoustically unambiguous form." In our study we show exactly that, for this to happen, activity in the motor system needs to be consistent with the information reaching the temporal lobes. If this condition is not met, we may mistake tog for tod. As this happens, in spite of the obvious fact that the ears are indeed not attached to the motor system, we concluded that motor systems interact with superior-temporal cortex in the speech perception process.

First, no one doubts (that I know of) that the motor system can interact with superior temporal cortex. The issue is whether the motor system is required for speech perception to happen. Regarding my specific point that we hear tog as tog when it is presented in an acoustically unambiguous form, D'Ausilio et al. seem to be suggesting that the motor system is required for this to happen. Yet in their article they state, "In order to avoid ceiling effects in the phoneme identification task, we immersed vocal recordings in 500 ms of white noise." Hmmm. These ceiling effects would be what, exactly? That subjects would hear the phonemes exactly as they are presented acoustically despite motor stimulation? Apparently, their own study demonstrates my point.

They go on to say:

One may conceptualize the underlying mechanisms as similar to attentional influences, stemming from the bidirectional feedback and feedforward connections [1] between superior-temporal and motor systems, and leading to an enhancement of superior-temporal activation as a consequence of the joint system they encompass [2].

Sounds right to me! In fact, this is pretty much what I said in my commentary: "there is strong evidence that motor-related systems are not fundamental to speech perception, but instead, simply modulate the process in some way."

Another important point we would like to stress is that, although we apply TMS on M1, we explicitly state in our paper that areas adjacent to M1 may be critically involved in speech perception.

My arguments are not aimed directly at M1 but at motor systems more generally.

The striking finding is, however, that the facilitation and disfacilitation is manifest in a somatotopic manner, yielding double dissociations on accuracies and reaction times, thus demonstrating a causal relationship between motor and acoustic mechanisms.

Yes, this is a very nice finding, and yes it does show that motor stimulation can influence speech perception. But again that is not what the argument is about.

Here's where it starts to get interesting:

..old neurological models [4, but see 5 for a critical historical commentary], and equally the proposal by Hickok [6], have denied a necessary role of the motor system in speech perception. This is in contrast with evidence from the aphasia literature, where it had been known for a long time that aphasia, even if its underlying lesion is restricted to the frontal cortex, is a general multimodal deficit affecting both the production of speech and its perception and comprehension [7]. Clinical tests for selecting aphasics from other brain-damaged individuals include, thus, speech comprehension test [8].

Reference #7 is to a clinical textbook on aphasia (Rosenbek, J. C., LaPointe, L. L., & Wertz, R. (1995). Aphasia: A clinical approach (2nd ed.). Boston: College-Hill Press.). I'm sure it is a wonderful book, but probably not the best primary source for their claim. Reference #8 is to the Token Test (De Renzi, E., and Vignolo, L. (1962). The Token Test: a sensitive test to detect receptive disturbances in aphasics. Brain, 85, 665-678), which assesses comprehension of commands. The test involves a set of "tokens" of various sizes, colors, and shapes, and ranges from simple commands ("touch the yellow circle") to multi-clause, multi-step commands ("put the large black square on the small yellow circle"). This is obviously a very general measure that will pick up any number of deficits, ranging from auditory comprehension, to working memory, to executive function. It is not surprising that exclusively frontal lesions can lead to deficits on this task. More to the point, the issue of sentence comprehension is completely orthogonal to the role of motor involvement in speech perception.

Furthermore, aphasic patients generally exhibit abnormalities in speech perception [9], especially a deficit in phoneme identification, in tasks such as the one used in our study [10].

Now we come back to the question of task issues. There is no need to rehash the details of the arguments here – well maybe there is but I won’t – other than to say (again) that performance on phoneme identification and discrimination tasks double-dissociates from performance on auditory comprehension tasks, even those comprehension tasks that require fine phonemic discriminations (i.e., they use minimal pairs, differing by a single feature). In short, it turns out that phoneme identification is a metalinguistic skill that doesn’t reflect normal speech perception. The fact that aphasics may exhibit abnormalities on phoneme identification tasks is not valid evidence because the task is invalid. If you aren’t convinced of this, please read Hickok & Poeppel (2000, 2004, 2007) and this blog entry.

We should also stress that the hypothesis of perceptual relevance of motor systems requires precise experiments addressing this issue. However, Dr. Hickok refers to negative evidence that did not explicitly test perceptual relevance of motor centers, but rather are based on anecdotic reports or clinical tests at best

I referred to the observation that damage to M1, or to Broca’s area (bilaterally), or to large sectors of fronto-parietal cortex (severe Broca’s aphasia), or to the entire left hemisphere (my own Wada studies using minimal pair stimuli) does not prevent speech perception from occurring, and neither does having failed to develop speech (anarthria), not yet having developed speech (infants), or lacking the capacity to develop speech (chinchillas). If these studies are “anecdotic reports or clinical tests at best” then I stand corrected.

[see ref. 11 for an interesting demonstration of why Broca's aphasics usually show intact comprehension in standard clinical tests, although they are impaired in such ability].

Reference 11 is a very interesting study of the comprehension of acoustically distorted words; these were both low-pass filtered and compressed in time by 50% (Moineau, S., Dronkers, N. F., Bates, E. (2005). Exploring the processing continuum of single-word comprehension in aphasia. J Speech Lang Hear Res, 48, 884-96). What they found was that (i) word comprehension was worse in distorted compared to non-distorted conditions for Broca’s aphasics, but also for Wernicke’s aphasics, anomic aphasics, right hemisphere damaged patients, and normal controls, and that (ii) word comprehension was more affected by distortion in Broca’s and even more so in Wernicke’s aphasics than in the other groups. This latter result indicates that damage to frontal or posterior left hemisphere regions impacts speech comprehension under non-optimal conditions. Given that the lesions associated with Broca’s aphasia tend to be large, it is difficult to attribute this effect to damage to primary motor cortex, premotor cortex, or Broca’s region, but for the sake of argument, let’s suppose it is. This still does not mean that speech perception is grounded in motor systems. First, the fact remains that under optimal listening conditions, comprehension performance among Broca’s aphasics did not differ statistically from normal controls (whereas Wernicke’s aphasics’ performance did). With the speech motor system largely out of the picture in Broca’s aphasia, something is supporting auditory comprehension. Presumably it is the temporal lobe(s). If speech perception were grounded in the motor speech system, one would expect even normal speech perception to be impaired following large lesions to this system, yet this is not the case. Rather, the finding that frontal lesions can exacerbate speech recognition deficits under distorted listening conditions suggests that this tissue can modulate speech recognition processes to some degree, perhaps via motor prediction (forward models) or perhaps via attention, executive, or working memory systems.

what Dr. Hickok considers an index of preserved comprehension (80% of accuracy) is, in our view, a really relevant deficit

Eighty percent accuracy is indeed a significant deficit on a word recognition task. But as I pointed out, much of this deficit may not result from difficulty in speech sound perception but from higher-level dysfunction. Further, this performance level holds for non-fluent patients with effectively zero speech production capacity. In this context, 80% accuracy far outstrips the ~0% motor speech performance.

although we consider patients studies as strongly informative on brain function, we should keep in mind the fact that it is often extremely difficult to generalize these data to situations not specifically tested by a given study.

So the fact that patients with lesions to the motor system or with no motor speech capacity can nonetheless comprehend speech is non-generalizable because “situations” were not specifically tested? This is hand-waving. What are these “situations”? And more importantly, how does one explain the preserved comprehension in the face of motor speech system damage?

Now it gets confusing:

Dr. Hickok proposes three alternative interpretations to explain our data, that we summarize as follows: 1. motor to sensory flow (activation of forward models); 2. existence of a "third" decision area gathering information from sensory and motor cortices; 3. TMS targeted attentional processes towards phonological features. The first explanation is actually our interpretation.

If this is the authors’ interpretation then indeed we have no argument. However, in the next paragraph we see this statement:

One may still want to claim, "The temporal lobe perceives speech while the motor system only helps." However, we think that this position stems from old-fashioned philosophies about the nature of brain areas as a modular input or output processors. As we point out in our paper, advances in the brain sciences in the last twenty years have taught us that neuronal assemblies encompass motor and perceptual "modules" of the brain and build distributed functional systems to which especially the motor system makes an eminent contribution [14].

They don’t seem to believe that the motor system only helps (my position). What do they mean, then, when they say that their findings are explained by motor to sensory flow? Maybe I’m too old-fashioned to understand (see below for old-fashioned speculation). By the way, they are incorrect about the “old-fashioned” theories, and it is not just the last 20 years that have taught us about sensory-motor relations. Wernicke noticed that posterior aphasics have speech production deficits (that’s right, production deficits resulting from damage to sensory cortex) and explicitly proposed that sensory systems interact with (help guide) the motor system during speech acts. Wernicke was just as modern in this respect as, say, Pulvermüller (whose model is functionally identical to Wernicke’s), except that the dynamic influence flowed most noticeably in the sensory to motor direction (sensory guides motor) rather than the motor emphasis of the “modern” theorists.

Thus, specific motor-perceptual channels seem to exist in the brain and these channels work by associating the acoustic property of, e.g., the speech sound /b/ with the motor representation of the articulatory gesture leading to the production of the same speech sound in the listener's motor brain. We see this finding very close to the Liberman's idea of motor perception and we felt ourselves obliged to recognize the intellectual merit of his intuition.

Liberman believed that the activation of motor speech systems WAS speech perception, not the mere association. Again, this is an interesting and thoughtful (but incorrect) hypothesis. But how close is “very close”? We need some clarification.

Distributed systems with a strongly linked action and perception subcomponents explain patterns of deficits in aphasia, especially dissociations between motor and perceptual impairments in case of lesion of the distributed neuronal assemblies at their acoustic or motor ends [15, 16].

Ok, wait. So motor and perceptual impairments do dissociate? This is what I was arguing! No fair switching sides! (I kind of feel like Daffy Duck arguing ‘Duck season – Wabbit season’ with Bugs Bunny!) Why couldn’t we just start with this admission and move on from there?

Ultimately, as a distributed circuit needs to receive sensory input and control motor output, cutting of these afferent and efferent connections does explain the occasionally observed unimodal deficits mentioned in Hickok's contribution.

Oh, maybe they mean the peripheral sensory and motor systems… Like the entire left hemisphere, for example, or Broca’s area bilaterally.

By no means do these dissociations prove the modular nature of the language system. Lesion evidence argues in favour of a distributed systems account [17]. In sum, we do not think that Hickok's proposal provide reasonable arguments for rejecting functional interactions between motor and language systems, speech perception systems included.

I’m not claiming the language system is modular, nor am I rejecting the existence of functional interactions between motor and language (they probably meant sensory) systems. No fair switching arguments!

Here’s what I am guessing the authors believe (if you push hard enough to find out). Speech sounds are represented in distributed sensory-motor systems. Activation of the entire sensory-motor network = activation of a phonological representation. These distributed representations are then used for lexical lookup. This is a reasonable hypothesis. However, this is not a motor theory of speech perception, nor is this a theory in which speech perception is “grounded” in motor circuits. If this is in fact what the authors believe, then it is misleading for them to place so much emphasis on the motor half of the equation. On the other hand, this work grows out of the mirror neuron literature where very explicit claims are made regarding the central role of the motor system in action understanding. So maybe they really do believe in a motor theory of speech perception.

I would love to hear from any of the authors on the paper so we can sort these issues out.


Anonymous said...

Dear Greg,

I just read your response to d'Ausilio and colleagues regarding a possible mediating role of the motor system in speech perception. As you rightly mention, one crucial point is that, in their elegant study (as well as in Meister et al.'s study), the phoneme identification task was performed in the presence of masking noise which reduced performance overall and therefore impacts on the interpretation of the results. It is therefore still unclear whether the motor system is functionally activated under 'normal' speech processing conditions and, if not, whether the motor system involvement is only functional in the presence of sensory challenge or is activated more generally when task demands (beyond increasing signal to noise) are increased.

Actually, my colleagues and I performed an rTMS study in which 600 low-frequency (1 Hz) pulses were applied to the left superior ventral premotor cortex (same site as in Meister et al.'s study) and participants subsequently performed auditory speech tasks involving the same set of nonsense syllables (e.g., /pyd/, /pon/) without any masking noise but differing in the use of phonemic segmentation processes. Compared to sham stimulation, rTMS applied over the ventral premotor cortex resulted in slower phoneme discrimination requiring phonemic segmentation. No effect was observed in phoneme identification and syllable discrimination tasks that could be performed without need for phonemic segmentation.

We think that these results are fully consistent with the dual-stream model and the recruitment of a dorsal auditory-motor circuit, in which motor representations of speech are thought to be used strategically to assist in working memory and sub-lexical task performance (here phoneme segmentation). In a paper, now accepted in Brain and Language (see the reference below), we concluded that 'speech perception is best conceptualized as an interactive neural process involving reciprocal connections between sensory and motor areas whose connection strengths vary as a function of the perceptual task and the external environment. The observed disruptive rTMS effect suggests that the left svPMC plays a functional role in speech segmentation and is recruited with increased task demands under normal listening conditions.'

Nevertheless, as demonstrated by d'Ausilio and colleagues, the fact that the motor system specifically interacts with superior-temporal cortex in the speech perception process under adverse listening conditions is a very interesting result. This is in keeping with previous fMRI studies and suggests that the motor system reacts to noise or novelty or mismatch by enhancing the auditory signal to resolve signal ambiguity.

Finally, one intriguing question, at the core of your debate regarding the involvement of the motor system and more generally the mirror neuron system in speech perception, is what could be the function of the motor activity observed during passive speech perception under normal listening conditions in single-pulse TMS and sparse-sampling fMRI studies? I think that one possibility is that the involvement of the motor system would not be strictly intrinsic to speech comprehension but may facilitate conversational exchange by contributing to a common perceptuo-motor framework between speakers. In that case, speech motor resonance may represent a dynamic sensorimotor adaptation under the influence of the other talker's speech patterns, and in return may facilitate conversational interactions through convergent behaviors. Actually, we discussed this possibility with Luciano Fadiga and Jean-Luc Schwartz in a review paper and it is also mentioned in a recent paper by Sophie Scott and colleagues.

Best regards,

Marc Sato

- Sato, M., Tremblay, P. & Gracco, V. (in press). A mediating role of the premotor cortex in phoneme segmentation. Brain and Language.
- Schwartz, J.-L., Sato, M. & Fadiga, L. (2008). The common language of speech perception and action: a neurocognitive perspective. Revue Française de Linguistique Appliquée, 13(2): 9-22.
- Scott, S.K., McGettigan, C. & Eisner, F. (in press). A little more conversation, a little less action--candidate roles for the motor cortex in speech perception. Nature Reviews Neuroscience.

Greg Hickok said...

Hi Marc,

Thanks for your thoughts. I certainly like your view on the role of frontal circuits in the face of increasing task demands. I also think it is plausible that the motor system might be a part of a mechanism to deal with noisy input as in the D'Ausilio et al. study. I like one aspect of the way you phrased its contribution, that the motor system operates "by enhancing the auditory signal to resolve signal ambiguity."

I'm confused and/or puzzled by other comments though. You mention that "the motor system reacts to noise or novelty..." I'm not convinced that the motor system reacts to anything! No one seems to be considering the possibility that under demanding conditions humans (and not just their motor system) use whatever resources are available to them. If I'm bored in a lecture that I really should be paying attention to, I force myself to "talk along" with the speaker (a kind of "motor simulation"?), to focus on lip movements, to use whatever semantic, pragmatic, or visual (e.g., ppt slides) context that I can find. This feels effortful and non-automatic. Isn't it possible that the mysterious "central executive," in an effort to deal with high load situations, is activating any and all relevant resources, one of them being some form of motor speech prediction, to stay focused and help disambiguate the signal? In other words, maybe it's not that "the motor system reacts"; it's that the human reacts and activates the motor system to help.

Is any of this making sense? Or is it way past my bedtime?

Brad Buchsbaum said...

Sorry for jumping in a little late, but I have a (perhaps obvious) observation. The debate seems to be coming down to what is meant by two words, "modulatory" and "grounded". (thought: maybe in Italian "grounded" really means "modulated by"?)

If someone says "X is grounded in Y" I think it's generally assumed to mean that the only thing X is grounded in is Y.

In other words if one makes the claim: "speech perception is grounded in motor circuits" it implies that the statement "speech is grounded in auditory cortex" is false, even if that was not the intended meaning of the remark. Something is rarely "grounded" in two things at once.

From the commentary it appears that Fadiga and colleagues don't really mean that speech perception is grounded in motor circuits after all, but rather that it is grounded in both auditory and motor systems, or that it is "distributed". If something is truly distributed, then it's hard to also make the claim that it is "grounded" somewhere.

So I'm left to conclude that "grounded" was perhaps not the right word for the concept that Fadiga and colleagues were trying to convey.

On the other hand, if the argument is that speech perception is not grounded anywhere, but rather is "distributed" (equally?) across auditory and motor cortices, then such a theory really ought to predict that damage to motor and sensory cortices should cause more or less equally severe deficits in speech perception.

Instead, lesions to auditory cortex can cause catastrophic deficits to speech perception -- i.e. "pure word deafness" whereas lesions to motor cortices result in rather mild deficits. From an anatomical standpoint then, there seems to be a very good prima facie case that, as Dr. Hickok has been arguing, speech perception is indeed "grounded in auditory circuits", but that motor circuits can play a "modulatory role".