Thursday, April 23, 2009

Motor influence of speech perception: The view from Grenoble

We had a nice visit with Grenoble's own Jean-Luc Schwartz at TB West this past week. Jean-Luc has been working on motor influences on speech perception for years (decades even) and has a very thoughtful and empirically solid perspective on the issue. Here is the structure of one argument that I found particularly interesting and compelling (Jean-Luc, I'm going to steal a couple of your ppt file images; I'm hoping you won't mind. And please correct my errors in summarizing your points!):

1. A classic problem in speech perception research, and one that led to the development of the motor theory, is the acoustic variability associated with individual phonemes caused by coarticulation: the /d/ in /di/ and /du/ have different acoustic signatures (but the same articulation).

2. However, as J-L noted, if you look in acoustic space for the whole syllable (e.g., in a plot of F1 vs. F2, I believe) one can capture the distinction between /di/ and /du/ quite nicely. In other words, you can solve the lack of invariance problem acoustically just by widening your temporal window from segment to syllable.

3. However -- and here is J-L's motor influence argument -- there is no reason why we should hear /di/ and /du/ as containing the same onset sound. If it were all just auditory, why wouldn't we just hear /di/ /da/ /du/ and /bi/ /ba/ /bu/ as six different acoustic categories instead of the two onset categories indicated in the figure below? Answer: the categorization comes from the way the phonemes are articulated, not from their acoustic consequences.

He also presented similarly structured arguments (i.e., that generalizations can be made over motor but not perceptual systems) using data from the distribution of vowels in the world's languages and from perceptual tendencies in the verbal transformation effect.

Jean-Luc is not arguing here for a hardcore motor theory. In fact, he argues that a pure motor theory is indefensible. Rather, the claim is that acoustic categories are modified by the motor system. I think this is a perfectly reasonable conclusion, and one that is consistent with my basic position -- that access to the lexicon is from auditory-phonological systems. One issue I did raise, however, is that while it seems clear that phonological categories (phonemes) are influenced by motor systems, there really is not any evidence that this information actually modifies perceptual categories. For example, maybe in our perceptual system all we really have is six different categories for di da du bi ba bu? It is only when you need to map these sounds onto articulatory gestures that the system needs to pick up on the fact that there are commonalities among the first three vs. the last three.

You might want to argue that this can't be right because we obviously hear di da du as all starting with /d/. But I'm not so sure. I think this may be a consequence of the fact that we have been taught, for the purpose of learning to read, that words are composed of individual phonemes. Again, I think it is critical to remember that when we listen to speech under ecologically valid conditions, we don't hear speech sounds, we hear words (i.e., meanings).

Here are a few recent papers by Jean-Luc and colleagues. Marc Sato, who has contributed to this blog, is among these colleagues, by the way. These folks are doing some really good work and are definitely worth following.

Sato M, Schwartz JL, Abry C, Cathiard MA, Loevenbruck H. Multistable syllables as enacted percepts: a source of an asymmetric bias in the verbal transformation effect. Percept Psychophys. 2006 Apr;68(3):458-74.

Ménard L, Schwartz JL, Boë LJ. Role of vocal tract morphology in speech development: perceptual targets and sensorimotor maps for synthesized French vowels from birth to adulthood. J Speech Lang Hear Res. 2004 Oct;47(5):1059-80.

Sato M, Baciu M, Loevenbruck H, Schwartz JL, Cathiard MA, Segebarth C, Abry C. Multistable representation of speech forms: a functional MRI study of verbal transformations. Neuroimage. 2004 Nov;23(3):1143-51.

Rochet-Capellan A, Schwartz JL. An articulatory basis for the labial-to-coronal effect: /pata/ seems a more stable articulatory pattern than /tapa/. J Acoust Soc Am. 2007 Jun;121(6):3740-54.

Sato M, Vallée N, Schwartz JL, Rousset I. A perceptual correlate of the labial-coronal effect. J Speech Lang Hear Res. 2007 Dec;50(6):1466-80.

Sato M, Basirat A, Schwartz JL. Visual contribution to the multistable perception of speech. Percept Psychophys. 2007 Nov;69(8):1360-72.

Basirat A, Sato M, Schwartz JL, Kahane P, Lachaux JP. Parieto-frontal gamma band activity during the perceptual emergence of speech forms. Neuroimage. 2008 Aug 1;42(1):404-13. Epub 2008 Apr 16.

Tuesday, April 21, 2009

Einstein's brain: anomalous auditory/language dorsal stream

A forthcoming paper in Frontiers in Evolutionary Neuroscience by Dean Falk shows that Albert Einstein's brain had some rare anatomical anomalies involving language-related sensory-motor areas, regions I consider to be part of the auditory "dorsal stream" -- or more accurately, the vocal-tract sensory-motor integration circuit (Hickok & Poeppel, 2007; Pa & Hickok, 2008). Falk suggests that these anomalies may be related to Einstein's reported delay in language development as well as to his self-reported tendency to use visual imagery over audio-verbal imagery.

Falk analyzed gross anatomical features of Einstein's brain from photographs. She reported atypicalities in the pre- and post-central gyrus region, but most striking to my eye is the tangled mess of Einstein's posterior Sylvian region. His postcentral sulcus (Pti) extends into the Sylvian fissure, at which point the Sylvian seems to just end. It would seem that this arrangement disrupts the typical configuration of the planum temporale, parietal operculum, and the entire supramarginal gyrus.

In fact, Falk suggests that Einstein's BA40 (the supramarginal gyrus) is split into two parts, as indicated on the rendering from Falk's paper where A is the typical arrangement and B is Einstein's.

It is not clear at all where sensory-motor area Spt (Hickok et al. 2009) might be in this pattern. We have suggested that area Spt, as part of a sensory-motor integration circuit, is critical for speech development (Hickok & Poeppel, 2007), but there is little direct evidence on this point. I'm not sure I would count Einstein's anomalies in this region and associated language delay as evidence, but it is an interesting observation.


Falk, D. (2009). New Information about Albert Einstein's Brain. Frontiers in Evolutionary Neuroscience. DOI: 10.3389/neuro.18.003.2009

Hickok, G., Okada, K., & Serences, J. (2009). Area Spt in the Human Planum Temporale Supports Sensory-Motor Integration for Speech Processing. Journal of Neurophysiology, 101 (5), 2725-2732 DOI: 10.1152/jn.91099.2008

Hickok, G., & Poeppel, D. (2007). The cortical organization of speech processing. Nature Reviews Neuroscience, 8 (5), 393-402 DOI: 10.1038/nrn2113

Pa, J., & Hickok, G. (2008). A parietal–temporal sensory–motor integration area for the human vocal tract: Evidence from an fMRI study of skilled musicians. Neuropsychologia, 46 (1), 362-368 DOI: 10.1016/j.neuropsychologia.2007.06.024

Monday, April 20, 2009

Post Doc at U Penn

Post-doctoral Position Available: The University of Pennsylvania

Applications are being accepted for a post-doctoral position working on the
neural basis of spatial cognition and communication. Post-doctoral fellows
will conduct functional neuroimaging and/or patient-based research and will
be involved in activities of the Center for Cognitive Neuroscience.
Experience with methods in functional neuroimaging and background in spatial
cognition or language research is highly desirable. Please send a statement
of interest, CV and 2 letters of reference to: Anjan Chatterjee,
Department of Neurology, 3 West Gates, 3400
Spruce Street, Philadelphia, PA 19104, (215) 662-4265.

Thursday, April 16, 2009

Broca's area: It's a dessert topping! No it's a floor wax! No it's a cognitive control mechanism!

Debates over the function of Broca's area remind me of the old Saturday Night Live skit where a husband (Dan Aykroyd) and wife (Gilda Radner) are arguing about whether a product, "New Shimmer" is a dessert topping or a floor wax:

Wife: New Shimmer is a floor wax!
Husband: No, new Shimmer is a dessert topping!
Wife: It's a floor wax!
Husband: It's a dessert topping!
Wife: It's a floor wax, I'm telling you!

The spokesman (Chevy Chase) quickly enters at this point and says:
Hey, hey, hey, calm down, you two. New Shimmer is both a floor wax and a dessert topping!

With respect to Broca's area we're in the middle of an even more complicated argument.
Grodzinsky: It's syntactic movement!
Friederici: It's a hierarchical structure processor!
Rogalsky/Hickok: It's articulatory rehearsal (at least in the back part)!
Rizzolatti/Fadiga: It's action understanding!

To this argument we can add another view, one that doesn't get talked about as much.
Novick, Trueswell, Thompson-Schill: It's a cognitive control mechanism!

This is an interesting claim and I wonder to what extent "cognitive control" may be our Chevy Chase, telling us that Broca's area can do more than one thing.
Of course, I think at least some of these territorial claims to Broca's area will be resolved simply by mapping these various functions (within subjects) to the different subregions that comprise the foot of the third frontal convolution. But we won't worry about those details at this point. For now let's just see what Novick et al. suggest.

First, Novick et al. are addressing the claim that Broca's area is specifically involved in syntactic computation or in the temporary storage of syntactic information. They are not trying to address the many other claims. This is fine, but eventually we'll have to deal with the full range of data.

Now, what do they mean by cognitive control? Well, basically it is a mechanism for conflict resolution. Ok, what's that? They define conflict as:
...cases in which an individual receives incompatible information either about how best to characterize a stimulus or how best to respond to that stimulus. p. 265

They mention the Stroop task as a classic example. So cognitive control in this context would be the process of shifting attention toward task-relevant stimulus characteristics in order to override automatically generated but currently irrelevant representations (paraphrased from page 265).
What's the link to Broca's area? Well, incongruent trials on Stroop tasks activate Broca's area, as does the processing of garden-path sentences, which requires resolution of syntactic ambiguity (conflict). Lesion and individual difference data are also presented along these lines. A subsequent empirical study by this group found that Stroop tasks and syntactic ambiguity resolution co-localize in Broca's area (January, Trueswell, & Thompson-Schill, in press, Journal of Cognitive Neuroscience).

I don't necessarily think that this hypothesis is going to solve the problem of what Broca's area is doing -- it's not going to be that simple. For example, it doesn't explain why portions of Broca's area activate during simple articulatory rehearsal. But there is enough evidence that this sort of claim needs to be included in the discussion.
Importantly, mechanisms such as this need to be considered when discussing the role of "motor areas" -- Broca's being a centerpiece in the mirror neuron "motor system" -- in speech perception. I suggested in my critique of D'Ausilio et al.'s paper that motor speech systems may influence perception in the following way:

motor and perceptual information may converge on higher-order executive processes where this information is used to color decision-making processes

This may be particularly relevant in situations where the speech stimuli are ambiguous, as in the noise degraded stimuli of D'Ausilio et al.
They responded to this suggestion by saying that this...
interpretation reminds [us of] 18-19th century models of the human mind, in the sense that requires an additional functional module

I'm no expert, but my guess is that folks who study decision making for a living might argue (convincingly) that such a system is needed on independent grounds. In any case, the point is that "Broca's area" may be involved in certain functions that could be considered "executive" and one shouldn't be too hasty to ascribe a motor explanation to everything that happens in Broca's area.

Novick, J., Trueswell, J., & Thompson-Schill, S. (2005). Cognitive control and parsing: Reexamining the role of Broca's area in sentence comprehension. Cognitive, Affective, & Behavioral Neuroscience, 5 (3), 263-281 DOI: 10.3758/CABN.5.3.263

Wednesday, April 15, 2009

Fadiga vs. Hickok

It looks like one session at the SfN satellite meeting, NLC2009, will feature Luciano Fadiga and yours truly each making their case for the neural basis of speech perception, followed by lots of discussion. Should be fun. Hope to see you there.

NLC2009 announcement below.

Dear Colleagues,

We are delighted to announce that the Call for Abstracts for the first Neurobiology of Language Conference (NLC2009), to be held in Chicago, on October 15-16 2009, is now open! Please note that the deadline to submit an abstract is May 17 2009.

NLC2009 will be held as a satellite event of the 39th annual meeting of the Society for Neuroscience (SfN). Please note that SfN regulations allow individuals to present their SfN abstracts during SfN satellite events. Also note that it is not necessary to be a member of the SfN to attend NLC2009.

For more information on the Conference, or to submit an abstract, please visit our website. The website will be updated continually as information becomes available. Watch for an email announcing the opening of the registration site in May!

Should you have questions regarding the abstract submission and/or the evaluation processes, please send us an email.

We look forward to seeing you in Chicago!


Pascale Tremblay, Ph.D., Postdoctoral Scholar, The University of Chicago
Steven L. Small, Ph.D., M.D., Professor, The University of Chicago

NLC2009 Organizing Committee:
Jeffrey Binder, M.D., Medical College of Wisconsin, USA
Sheila Blumstein, Ph.D., Brown University, USA
Laurent Cohen, M.D., Hôpital de la Salpêtrière, France
Angela Friederici, Ph.D., Max Planck Institute, Germany
Vincent Gracco, Ph.D., McGill University, Canada
Peter Hagoort, Ph.D., Max Planck Institute, Netherlands
Marta Kutas, Ph.D., The University of California, San Diego, USA
Alec Marantz, Ph.D., New York University, USA
David Poeppel, Ph.D., New York University, USA
Cathy Price, Ph.D., University College London, UK
Kuniyoshi Sakai, Ph.D., Tokyo University, Japan
Riitta Salmelin, Ph.D., Helsinki University, Finland
Bradley Schlaggar, M.D., Ph.D., Washington University, USA
Richard Wise, M.D., Ph.D., Imperial College, London, UK

Monday, April 6, 2009

Speech: Not enough distinctions are being made

Not enough distinctions are being made. For better or for worse -- and probably for worse -- let me reiterate a few points that have been raised, because they point to the need for much greater 'conceptual hygiene.' I forget who keeps using the "not enough distinctions" phrase (it sounds like the philosopher Jerry Fodor), but I think this point is critical in our current back and forth. Not enough distinctions are being made. Consequently, the ongoing discussion about speech is not sufficiently granular.

1. The 'moving parts' (or atoms, or lego blocks, or primitives, or whatever) at the basis of spoken language processing -- both from the input and output sides -- are of course more complex, and larger in number, than we ever discuss here, and consequently the discussion could get hijacked by underspecified concepts. And sometimes is ...

For example ... When we are discussing the so-called motor aspects -- which ones?? To pick up on yesterday's posts, in the Liberman revised motor theory, the objects of perception are intended articulatory gestures. As was rightly pointed out, how close to actual motor output are such objects? That is itself a topic of inquiry, and a complicated one at that. The motor system is not monolithic, and it matters a great deal whether we are working on neuronal populations that form the immediate substrate of motor output or populations that are richly connected to sensory areas but are distal to, say, M1 neurons. Incidentally, the literature on eye movements is worth looking to for some inspiration in this regard. More on that eventually.

Similarly, we should, I think, be very very careful about distinguishing forward models (that rely on a strong predictive element) from the motor generation of output. A forward model is associated with output -- but is not the same as the motor program that generates the output. And, crucially, a forward model does not have to be instantiated in motor cortex. That's an entirely different question, as again was pointed out.

If we think of the term "motor" as referring to the neural circuitry that underlies output generation (i.e. the part of motor cortex that is required for speaking), then a motor theory is, I think, wrong, and if Luciano Fadiga (Hi Luciano -- thanks for participating!) is intending this view of a motor theory then it won't work.

2. Greg keeps harping on this, and let me also emphasize: what you use as a task matters a great deal. There is, from a perceptual, computational, and neurophysiological point of view a huge difference between, say, syllable discrimination in an experimental task setting, on the one hand, and comprehending spoken sentences, on the other. There are OBVIOUSLY some overlapping component processes, but can we please please please move on from this point? This issue has been rehearsed and discussed since the late 1990s ...

All else being equal, I am persuaded by an auditory view of speech perception, in which internal forward models (but not motor output models) play an important role (for example incorporating algorithms such as analysis-by-synthesis). I am happy to see and admit to a modulatory role of the type demonstrated by Luciano Fadiga and colleagues, but that activity is not epistemologically prior or causally necessary.

Sunday, April 5, 2009

Neural Models of Speech Recognition

There still seems to be some confusion about what exactly is being claimed by various neuro-theorists regarding the functional architecture of speech recognition. This goes for the Dual Stream model as well. I just got back from a great visit to the University of Chicago where I had the opportunity to spend a lot of time talking to Steve Small, Howard Nusbaum, and Matt Goldrick (who came down from Northwestern to hang out for a while). We had some great discussions and I learned a lot. One issue that came up in these discussions was that it is not clear what everyone's position is on how speech recognition happens, particularly in regard to the relative role of the sensory and motor systems. So here is my attempt to clarify this.

There are at least three types of models out there: 1. auditory models, 2. motor models, and 3. sensory-motor models.

Here's my simplified cartoon of an auditory model:

This is closest to my view. The access route from sound input to the conceptual system does not flow through the motor system although the motor system can modulate activity in the sensory system.

Here's a cartoon of a motor theory:

Something like this has been promoted by Liberman in the form of the Motor Theory of speech perception, as well as by Fadiga. One comment I'm getting a lot lately (including from Luciano) is that no one really believes in the motor theory. So here are a couple of quotes from Fadiga & Craighero, Cortex (2006), 42, 486-490:

According to Liberman’s theory … the listener understands the speaker when his/her articulatory gestures representations are activated by the listening to verbal sounds. p. 487

Liberman’s intuition … that the ultimate constituents of speech are not sounds but articulatory gestures that have evolved exclusively at the service of language, seems to us a good way to consider speech processing in the more general context of action recognition. p. 489

On this view, the route from acoustic speech input to the conceptual system flows through the motor system.

Here is my cartoon of a sensory-motor model:

This seems to be what Fadiga has in mind based on his comments on this blog, namely that it is the "matching" of the sensory and motor systems that is critical for recognition to happen.

As Brad Buchsbaum pointed out, both a motor theory and a sensory-motor theory would predict that damage to the motor-speech system should produce substantial deficits in speech recognition. As this prediction doesn't hold up empirically, these theories in their strong forms are wrong.