There was a very interesting speech/language session at SfN this year organized by Jonathan Peelle. Talks included presentations by Sophie Scott, Jonas Obleser, Sonia Kotz, Matt Davis and others, spanning an impressive range of methods and perspectives on auditory language processing. Good stuff and a fun group of people. It felt kind of like a joint lab meeting with lots of discussion.
I want to emphasize one of the issues that came up, namely, the brain's response to intelligible speech and what we can learn from it. Here's a brief history.
2000 - Sophie Scott, Richard Wise and colleagues published a very influential paper which identified a left anterior temporal lobe region that responded more to intelligible speech (clear and noise vocoded sentences) than unintelligible speech (spectrally rotated versions of the intelligible speech stimuli). It was argued that this is the "pathway for intelligible speech".
2000 - Hickok & Poeppel published a critical review of the speech perception literature arguing, on the basis of primarily lesion data, that speech perception is bilaterally organized and implicates posterior superior temporal regions in speech sound perception.
2000-2006 - Several more papers from Scott/Wise's group replicated this basic finding, but additional areas started creeping into the picture, including left posterior regions and right hemisphere regions. The example figure below is from Spitsyna et al. 2006.
2007 - Hickok & Poeppel again reviewed the broader literature on speech perception, including lesion work as well as studies that attempted to isolate phonological-level processes more specifically. It is concluded, yes you guessed it, that Hickok & Poeppel 2000 were pretty much correct in their claim of a bilaterally organized posterior temporal speech perception system.
2009 - Rauschecker and Scott publish their "Maps and Streams" review paper arguing just as strongly that speech perception is left lateralized and is dependent on an anterior pathway. As far as I can tell, this claim is based on (i) analogy to the ventral stream pathway projection in monkeys (note: we might not yet fully understand the primate auditory system and given that monkeys don't have speech, the homologies may be less than perfect), and (ii) the fact that the peak activation in intelligible minus unintelligible sentences tends to be greatest in the left anterior temporal lobe.
2010 - Okada et al. publish a replication of Scott et al. 2000 using a much larger sample than any previous study (n=20 compared to n=8 in Scott et al. 2000) and find robust bilateral anterior and posterior activations in the superior temporal lobe for intelligible compared to unintelligible speech. See figure below, which shows the group activation (top) and peak activations in individual subjects (bottom). Note that even though it doesn't show up in the group analysis, activation extends to right posterior STG/STS in most subjects.
So that's the history. As was revealed at the SfN session, controversy still remains, despite the existence of what I thought was fairly compelling evidence against an exclusively anterior-going projection pathway.
Here's what came out at the conference.
I presented lesion evidence collected with my collaborators Corianne Rogalsky, Hanna Damasio, and Steven Anderson, which showed that destruction of the left anterior temporal lobe "intelligibility area" has zero effect on speech perception (see figure below). This example patient performed with 100% accuracy on a test of auditory word comprehension (4AFC, word to picture matching with all phonemic foils, including minimal pairs), and 98% accuracy on a minimal pair syllable discrimination test. Combine this with the fact that auditory comprehension deficits are most strongly associated with lesions in the posterior MTG (Bates et al. 2003) and this adds up to a major problem for the Scott et al. theory.
The counter-argument from the Scott camp was addressed exclusively at the imaging data. I'll try to summarize their main points as accurately as possible. Someone correct me if I've got them wrong.
1. Left ATL is the peak activation in intelligible vs. unintelligible contrasts
2. Okada et al. did not use sparse sampling acquisition (true), so the added scanner noise increased the intelligibility processing load (possible), thus recruiting posterior and right hemisphere regions
3. Okada et al. used an "active task" which affected the activation pattern (we asked subjects to press a button indicating whether the sentence was intelligible or not).
First and most importantly, none of these counter-arguments provides an account of the lesion data. We have to look at all sources of data in building our theories.
Regarding point #2: I will admit that it is possible that the extra noise taxed the system more than normal and this could have increased the signal throughout the network. However, these same regions are showing up in the reports of Scott and colleagues, even in the PET scans, and the regions that are showing up (bilateral pSTG/STS) are the same as those implicated in lesion work and in imaging studies that target phonological level processes.
Regarding point #3: I'm all for paying close attention to the task in explaining (or explaining away) activation patterns. However, if the task directly assesses the behavior of interest (which is not the case in many studies), this argument doesn't hold. The goal of all this work is to map the network for processing intelligible speech. If we are asking subjects to tell us if the sentence is intelligible, this should drive the network of interest. Unless, I suppose, you think that the pSTG is involved in decision processes, which is highly dubious.
This brings us to point #1: Yes, it does appear that the peak activation in the intelligible vs. unintelligible contrast is in the left anterior temporal lobe. This tendency is what drives the Scott et al. theory. But why the obsession with this contrast? There are two primary reasons why we shouldn't be obsessed with it. In fact, these points question whether there is any usefulness to the contrast at all.
1. It's confounded. Intelligible speech differs from unintelligible speech on a host of dimensions: phonemic, lexical, semantic, syntactic, prosodic, and compositional semantic content. Further, the various intelligibility conditions are acoustically different: just listen to them, or note that activity in A1 can reliably classify each condition from the others (Okada et al. 2010). It is therefore extremely unclear what the contrast is isolating.
2. By performing this contrast, one is assuming that any region that fails to show a difference between the conditions is not part of the pathway for intelligible speech. This is clearly an incorrect assumption: in the extreme case, peripheral hearing loss impairs the ability to understand speech even though the peripheral auditory system does not respond exclusively to intelligible speech. Closer to the point, even if it were the case that the left pSTG/STS did not show an activation difference between intelligible and unintelligible speech, it could still be THE region responsible for speech perception. In fact, if the job of a speech perception network is to take spectrotemporal patterns as input and map these onto stored representations of speech sound categories, one would expect activation of this network across a range of spectrotemporal patterns, not only those that are "intelligible".
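To make the logic of point 2 concrete, here is a toy sketch in Python. It is not a model of any real dataset: the two stage names (`periphery`, `downstream`) and all response values are made up for illustration. It simply shows that a stage can have a zero intelligible-minus-unintelligible contrast and still be necessary for everything that follows it.

```python
# Toy illustration only: stage names and response values are invented.

def responses(intelligible, periphery_intact=True):
    """Return made-up activation levels for two processing stages."""
    # The periphery responds to any sound, intelligible or not.
    periphery = 1.0 if periphery_intact else 0.0
    # The downstream stage responds more to intelligible input,
    # but only gets whatever the periphery passes along.
    gain = 1.0 if intelligible else 0.25
    downstream = periphery * gain
    return {"periphery": periphery, "downstream": downstream}


def contrast(periphery_intact=True):
    """Intelligible minus unintelligible response, per stage."""
    on = responses(True, periphery_intact)
    off = responses(False, periphery_intact)
    return {stage: on[stage] - off[stage] for stage in on}


if __name__ == "__main__":
    # The periphery shows no difference between conditions...
    print(contrast())
    # ...yet knocking it out silences the downstream stage entirely.
    print(responses(True, periphery_intact=False))
```

In this sketch the subtraction highlights only the downstream stage, even though the stage with the null contrast is the one whose loss abolishes the response to intelligible speech.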
I don't expect this debate to end soon. In fact, one suggestion for the next "debate" at the NLC conference is Scott vs. Poeppel. That would be fun.
Bates, E., Wilson, S.M., Saygin, A.P., Dick, F., Sereno, M.I., Knight, R.T., and Dronkers, N.F. (2003). Voxel-based lesion-symptom mapping. Nat Neurosci 6, 448-450.
Hickok, G., and Poeppel, D. (2000). Towards a functional neuroanatomy of speech perception. Trends in Cognitive Sciences 4, 131-138.
Hickok, G., and Poeppel, D. (2007). The cortical organization of speech processing. Nat Rev Neurosci 8, 393-402.
Okada, K., Rong, F., Venezia, J., Matchin, W., Hsieh, I.H., Saberi, K., Serences, J.T., and Hickok, G. (2010). Hierarchical organization of human auditory cortex: evidence from acoustic invariance in the response to intelligible speech. Cereb Cortex 20, 2486-2495. PMID: 20100898
Narain, C., Scott, S.K., Wise, R.J., Rosen, S., Leff, A., Iversen, S.D., and Matthews, P.M. (2003). Defining a left-lateralized response specific to intelligible speech using fMRI. Cereb Cortex 13, 1362-1368.
Rauschecker, J.P., and Scott, S.K. (2009). Maps and streams in the auditory cortex: nonhuman primates illuminate human speech processing. Nat Neurosci 12, 718-724.
Scott, S.K., Blank, C.C., Rosen, S., and Wise, R.J.S. (2000). Identification of a pathway for intelligible speech in the left temporal lobe. Brain 123, 2400-2406.
Spitsyna, G., Warren, J.E., Scott, S.K., Turkheimer, F.E., and Wise, R.J. (2006). Converging language streams in the human temporal lobe. J Neurosci 26, 7328-7336.