The first of the trio of theoretical papers by David and me (Hickok & Poeppel
2000 -- the others published in 2004 and 2007) was originally submitted as a Mini Review to Neuron, where it was summarily rejected. That first version, which we affectionately refer to as Hickok & Poeppel (rejected), did not lay out the Dual Stream model. Instead it simply tried to make a case for bilateral speech perception, following up on David's arguments in his dissertation by adding new information that I had come across on conduction aphasia and the effects of left STG lesions (see
this paper for some details on those arguments). After talking with an editor at TICS, we revised the manuscript and sent it to TICS in late 1999. This new manuscript introduced the notion of two task-dependent interfaces for acoustic speech information, i.e., the Dual Stream model, with the dorsal stream function driven by fMRI findings I was working on at the time, which showed auditory cortex activation during speech production (
Hickok et al. 2000). The task-dependent involvement of the two streams provided an explanation for why speech perception looked bilateral if you examined auditory comprehension tasks but left-dominant if you examined tasks like syllable discrimination (a major concern of one reviewer of our Neuron submission).
The original submission to TICS was rather different from the final published version. We had organized it around three hypotheses concerning the neural basis of speech perception. Reviewers were not impressed, but luckily the editor allowed us to attempt a revision. Below is the original submission to TICS that was rejected. Maybe I'll post the reviewer comments and my response if I can dig them up.
-Greg
***********************************************
Towards a Functional Neuroanatomy of Speech Perception
Gregory Hickok*
Department of Cognitive Sciences
University of California, Irvine
David Poeppel
Department of Linguistics and Department of Biology
University of Maryland, College Park
*Corresponding author
Department of Cognitive Sciences
University of California
Irvine, CA 92697
949-824-1409
949-824-2307 (fax)
gshickok@uci.edu
SUMMARY
The functional neuroanatomy of speech perception has been difficult to characterize. Part of the difficulty, we suggest, stems from the fact that the neural systems supporting "speech perception" vary as a function of task. Specifically, the cognitive and neural systems involved in performing traditional laboratory speech perception tasks, such as discrimination or identification, are not necessarily the same as those involved in speech perception as it occurs during natural language comprehension. In this review, we argue that cortical fields in the posterior superior temporal lobe, bilaterally, constitute the primary substrate for constructing sound-based representations of speech, and that these sound-based representations interface with different supramodal systems in a task-dependent manner. Tasks which require access to the mental lexicon (i.e., accessing meaning-based representations) rely on auditory-to-meaning interface systems in cortex in the vicinity of the left temporal-parietal-occipital junction; tasks which require explicit access to speech segments rely on auditory-motor interface systems in the left frontal and parietal lobes. The left frontal-parietal auditory-motor interface system also plays an important role in phonological working memory.
INTRODUCTION
A priori, one might imagine that identifying the cortical fields which support speech perception would be a relatively straightforward task. After all, speech perception is tightly coupled to a single sensory modality, unlike higher-level lexical or syntactic processes in language, for example, and therefore should have a relatively constrained neural organization. But it has been a century and a quarter since the first hypothesis concerning the neurology of speech perception was put forward1, and still there is no consensus. Early authors proposed that auditory cortical fields in the left superior temporal lobe comprise the primary substrate of speech perception1,2; others have pointed to the left inferior parietal lobe3 or have argued that the left inferior frontal cortex is a critical component of the system4; still others have emphasized the role of auditory cortices bilaterally5,6.
Part of this confusion stems from differences in what one means by “speech perception,” how one tests it behaviorally, and what methods are employed (e.g., lesion studies, functional imaging, and so on). Psychological research on speech perception typically utilizes tasks that involve the identification and/or discrimination of sub-lexical segments of speech, such as meaningless syllables, and many neuropsychological and functional imaging studies have borrowed from this rich literature. The tacit assumption in this work, of course, is that these tasks tap the same set of processes involved in the perception of speech sounds embedded in “normal” (e.g., conversational) speech. Whether this assumption is valid is an empirical issue, and in fact, we will argue, on the basis of neuropsychological and neurophysiological data, that it is not a fully valid assumption.
For this reason, we will use the term “speech perception” to refer to the process by which acoustic input is coded into neural representations suitable for making contact with the mental lexicon7. Notice, crucially, that this definition does not necessarily include those processes involved in extracting, for conscious access, sub-lexical information from speech. Also notice that this definition is agnostic with respect to whether speech perception processes constitute a speech-specific module8 or whether they are comprised of domain-general auditory processing systems. We will use the term “speech-sound perception” to refer to the process(es) involved in the extraction (and manipulation) of sub-lexical speech segments.
Figure 1 illustrates the anatomy to which we will refer. Note that we will use the term STG (superior temporal gyrus) to refer to the lateral (exposed) portion of that gyrus, and the terms pSTP (posterior supratemporal plane), aSTP (anterior supratemporal plane), and Heschl’s gyrus to refer to various regions of temporal cortex buried inside the Sylvian fossa. The term “superior temporal lobe” will be used generically to refer to these structures as a whole.
-----------------------
Figure 1 about here
-----------------------
THREE HYPOTHESES CONCERNING THE FUNCTIONAL LOCALIZATION OF SPEECH PERCEPTION
In this section we consider three hypotheses concerning the functional neuroanatomy of speech perception: (1) the left posterior superior temporal lobe is the only substrate for speech perception, (2) other left perisylvian regions make a significant contribution to speech perception, and (3) both left and right posterior superior temporal cortex participate in speech perception.
Hypothesis 1: Left Posterior Superior Temporal Lobe is the Only Substrate for Speech Perception
As noted above, early proposals concerning the neural basis of speech perception held that auditory fields in the left posterior superior temporal lobe comprised the critical substrate for speech perception. This claim was driven primarily by clinical observations of patients with Wernicke’s aphasia who have substantial auditory language comprehension deficits (at the level of words and sentences), and who frequently have brain lesions involving this structure: it was thought that an inability to correctly perceive speech sounds was the underlying source of the language comprehension deficits1,2.
Subsequent neuropsychological investigations by Blumstein and colleagues did not confirm this hypothesis, however. Two major findings came out of their investigations. The first was that performance on speech-sound perception tasks (CV discrimination/identification) does not predict the degree of auditory language comprehension deficit as measured by standard clinical diagnostic tests9. In fact, in one study, the patient with the best score on the auditory comprehension subtest of the Boston Diagnostic Aphasia Examination (BDAE)10, a Broca’s aphasic, could not reliably label or discriminate the CV stimuli; and the patient with the worst score on the BDAE auditory comprehension subtest, a Wernicke’s aphasic, performed normally on both of the CV perception tasks. The second major finding came out of a study11 which addressed the relation between speech perception and auditory language comprehension more directly. In this study, a group of Wernicke’s aphasics were asked to identify an auditorily presented word by pointing to the corresponding picture in an array of four pictures. The array of pictures included the target (e.g., a bear), a phonological foil (pear), a semantic foil (wolf), and an unrelated foil (grapes). The question was whether the patients with auditory comprehension deficits (the Wernicke’s aphasics) would make more sound-based errors than non-sound-based errors. The answer was that most of the Wernicke’s aphasics’ errors were semantically based, not sound-based (although they did make more sound-based errors than “unrelated” errors). Overall, Wernicke’s aphasics accessed the correct phonological representation of the target more than 80 percent of the time (the sum of the correct and semantically related responses).
What is clear from these studies is that in contrast to early claims, auditory comprehension deficits in Wernicke’s aphasia cannot be attributed solely to speech perception deficits. The fact that Wernicke’s aphasics did make a relatively small but significant number of sound-based errors in the Baker et al. study suggests at most a mild impairment in the ability to perceive speech sounds. On the basis of these kinds of results, several authors have come to the conclusion that the auditory comprehension deficit in Wernicke’s aphasia stems not primarily from a speech-sound perception problem per se, but rather from difficulties in the mapping between phonological representations and conceptual representations7,11,12. Indeed, in their review of auditory comprehension deficits in aphasia, Bachman and Albert12 conclude “that, although deficits in phonemic discrimination can be detected in some aphasic subjects, disordered phonemic perception does not appear to be the principle underlying deficit in impaired auditory comprehension.” (p. 285).
What can these results tell us about the functional localization of speech perception? Given (i) that Wernicke’s aphasia is not associated with substantial deficits in the ability to perceive speech sounds (no matter how you measure it), and (ii) that the lesion associated with Wernicke’s aphasia is most often a left hemisphere lesion which includes the posterior STG/pSTP (the classical definition of Wernicke’s area), but which also typically extends inferiorly into the MTG and posteriorly into the angular gyrus and SMG13-15, we can conclude that even relatively large posterior temporo-parietal lesions do not substantially interrupt speech perception abilities. Does this imply that temporo-parietal cortical fields in the left hemisphere are irrelevant to speech perception? Certainly not. First, the fact that Wernicke’s aphasics do in fact make some sound-based errors in auditory comprehension suggests that speech perception is mildly impaired in Wernicke’s aphasia. Second, it is possible that these cortices do, in fact, contain the primary substrate for speech perception systems, but that other areas can compensate when these areas are damaged. A third possibility is that speech perception systems are not as focally organized as one might think a priori; if there is a more widely distributed network (including perhaps frontal, parietal, and even right hemisphere systems), focal damage may yield only mild impairment. Let us tentatively conclude, then, that somewhere in the distribution of the lesion associated with Wernicke’s aphasia there exist systems involved in speech perception (explaining the mild deficits that are apparent), but that those cortical fields are not the only regions involved in, or capable of supporting, speech perception. Additionally, the fact that Wernicke’s aphasics have severe auditory language comprehension deficits which cannot be reduced to speech perception difficulties suggests that the regions in question also contain systems important for performing a mapping between sound and meaning7,11,16, and perhaps other, higher order linguistic processes.
Hypothesis 2: Left SMG and Broca’s Area Participate in Speech Perception
One possible explanation for why one doesn’t see severe speech perception deficits in Wernicke’s aphasia is that there are other areas in the left hemisphere which are part of a network involved in speech perception and which can compensate when the left superior temporal lobe is damaged. There are two candidate regions for this kind of extra-auditory participation in speech perception: the left SMG and Broca’s area.
The left supramarginal gyrus (SMG) has been implicated in a recent lesion study of patients with “acoustic-phonetic” deficits3. In that study, patients were selected for inclusion on the basis of relatively poor performance on a pre-test involving phoneme discrimination. Once selected, patients were tested on syllable discrimination and identification tasks using both natural and synthetic stimuli. Performance on these tasks was related to the amount of damage in 46 different left hemisphere brain regions, confirmed with MRI. Two main findings emerged. One was that the overall extent of lesion in left perisylvian cortex did not predict performance on the speech-sound perception tasks. The second was that the extent of damage to two regions, the parietal operculum and the posterior segment of the SMG, was related to overall performance on the speech-sound perception tasks. Further, eight of the ten patients in the study had lesions that included these areas. Interestingly, the two who did not had lesions which involved Broca’s area. The authors conclude that the left posterior SMG and parietal operculum constitute “a principle site of phonemic processing in speech perception” (p. 298), although they also point out that other areas are likely involved.
We agree that the inferior parietal lobe is a site which may play a role in “phonemic processing” as it is measured using syllable discrimination and identification tasks, but we suggest that it plays little or no role in “phonemic processing” as it applies to normal language comprehension. Two lines of evidence support our claim. The first is that lesions to the left inferior parietal lobe are not reliably associated with auditory comprehension deficits. Lesions involving the left SMG but sparing the left superior temporal lobe are known to be associated with conduction aphasia17,18, a syndrome in which auditory comprehension is near normal. Notably, this seems to hold true even in cases where the lesion involves both the SMG and auditory cortices in the pSTP19. Thus, damage to the left SMG, even when combined with damage to the left pSTP, does not lead to clinically significant auditory comprehension deficits. The second line of evidence comes from functional imaging. A number of PET20-24 and fMRI25-27 studies of the perception of auditorily presented speech stimuli ranging from individual words to sentences have been performed. These studies uniformly report activation of superior temporal lobe structures including STP/STG. These papers do not, however, uniformly report activation of the inferior parietal lobe28. Given that “phonemic processing” is undoubtedly being carried out when subjects listen to speech for comprehension, these results lead one to the conclusion that the inferior parietal lobe does not necessarily participate in speech perception as it is employed in natural auditory language understanding. These findings therefore confirm the conclusions drawn on the basis of the lesion data.
Similar arguments can be made regarding evidence used to suggest that Broca’s area participates in speech perception. Damage to Broca’s area is not associated with substantial auditory comprehension deficits14,29-31, and activation of Broca’s area in functional imaging studies is not uniformly observed when the task involves listening for comprehension22,24.
Several conclusions can be drawn from the range of studies discussed so far: (i) speech-sound discrimination/identification involves a different set of neural systems than does speech perception as it is employed in the service of auditory language comprehension; (ii) there is both lesion and functional imaging evidence supporting the view that the left superior temporal lobe participates in speech perception; however, (iii) lesion evidence also makes a strong case for the view that the left superior temporal lobe is not the only region involved; and (iv) there is little evidence supporting the hypothesis that the left SMG or Broca’s area is necessary for speech perception during auditory language comprehension. We now turn to the third hypothesis, namely, that auditory cortex in both the left and right temporal lobes participates in (or at least can support) speech perception to some extent5,6,32. We will show that this hypothesis not only accounts for the data reviewed so far, but also explains a host of other observations in a straightforward manner.
Hypothesis 3: Posterior Superior Temporal Lobe Bilaterally Supports Speech Perception
An early hypothesis concerning the possible role of the right hemisphere in speech perception comes from Karl Kleist. Kleist32 hypothesized that right auditory cortex in the superior temporal lobe could, at least in some cases, support speech perception when left auditory cortex is damaged. This claim was put forth as an explanation for the preserved auditory comprehension abilities in a case of conduction aphasia (Case Spratt):
The destruction of the left temporal lobe is so considerable that one would have expected a complete speech deafness. There is interruption of the auditory radiation, degeneration of the medial geniculate body, undermining of both transverse gyri and of the first temporal convolution, and damage to the second and third temporal convolutions. Comprehension of speech can thus only have been preserved through the function of the right temporal lobe. (pp. 47-48)
Benson, Geschwind and colleagues33 also invoked right hemisphere mechanisms as an explanation for preserved auditory comprehension in a case of conduction aphasia (their Case 2) with a similar left auditory cortex lesion. Thus, the idea that the right hemisphere is capable of supporting speech perception is not new, and has been put forth by several authors5,6,34,35.
If the right temporal lobe is capable of supporting speech perception with a reasonable degree of competence, we can then explain not only cases such as those described above in which auditory comprehension is clinically normal in the face of total destruction of left auditory cortices, but also the more general observation that unilateral damage anywhere in the left hemisphere does not seriously impair speech perception processes, even when auditory comprehension is substantially compromised. What we are suggesting, then, is that systems involved in mapping sound onto meaning are more strongly lateralized to the left hemisphere, hence one sees comprehension deficits following unilateral left hemisphere lesions, but systems involved in speech perception are more bilaterally organized, explaining the resilience of speech perception in the face of unilateral damage. At this point one might ask whether the right hemisphere participates in speech perception in the normal intact brain, or whether it just assumes a role in speech perception when left hemisphere systems are damaged. The latter possibility is what most previous authors seem to have in mind, but we propose that auditory cortices bilaterally participate in speech perception, even in neurologically intact individuals5,6.
This hypothesis makes several predictions. For example, bilateral lesions should seriously compromise speech perception, functional imaging studies of auditory speech perception should regularly produce bilateral activations, and one should in principle be able to demonstrate speech perception capacities of the isolated right hemisphere. All of these predictions are borne out in the literature.
EVIDENCE FOR BILATERAL ORGANIZATION OF SPEECH PERCEPTION SYSTEMS
Word Deafness
If speech perception systems are organized bilaterally, then we should predict that bilateral superior temporal lobe lesions will produce profound speech perception deficits. This is exactly the case in "pure word deafness" (henceforth, word deafness). Word deafness is a form of auditory agnosia in which the ability to comprehend auditorily presented speech is profoundly impaired. Patients are not deaf, and may have normal pure tone thresholds. Importantly, the auditory comprehension deficit in word deafness, unlike in Wernicke's aphasia, appears to reflect lower-level acoustic speech perception difficulties36,37: while Bachman and Albert12 conclude that "phonemic misperception" is not the principle underlying deficit in impaired auditory comprehension in aphasia, they go on to say, "Pure word deafness, however, may represent an exception to this statement." (p. 285). Word deafness, then, exhibits the profound speech-perception-based deficit not found among the unilateral aphasias. The present hypothesis of bilateral organization of speech perception systems predicts that the lesions associated with such a condition should be bilateral, and indeed, the majority of word deaf cases present with bilateral temporal lobe lesions38.
Physiological Evidence
If speech perception is mediated bilaterally, it should be the case that listening to speech will activate temporal lobe structures bilaterally. This prediction is borne out in a range of functional imaging studies of passive perception of speech stimuli, including PET20-23, fMRI25-27, and MEG39-41. It is impossible to know precisely what aspect of the speech stimulus is producing the activations in the right hemisphere. Right hemisphere activation could reflect speech perception processes, non-linguistic “acoustic” perception processes, prosodic perception processes, or even inhibitory activity. Presumably because of lesion evidence demonstrating language deficits following left but not right hemisphere damage, most investigators attribute right hemisphere activity to some non-linguistic (or at least suprasegmental) process. But what we have tried to argue above is that lesion evidence suggests that speech perception, in contrast to higher-order linguistic processes, is not strongly lateralized. If true, this argument removes any empirical justification for dismissing right hemisphere activations as irrelevant to the speech perception process.
A different line of empirical observation suggests, in fact, that right hemisphere activation in speech perception tasks may reflect speech perception processes. Intraoperative recordings from single units in awake patients undergoing surgical treatment for epilepsy have suggested the existence of cells in superior temporal lobe which are responsive to different aspects of the speech signal such as a particular class of phonemic clusters, mono- vs. multisyllabic words, task-relevant vs. task-irrelevant speech, and natural vs. distorted or backwards speech (Creutzfeldt et al., 1989). No hemispheric asymmetries were found in the distribution of these units. Some caution is warranted, however, in interpreting these findings because clinical constraints precluded detailed and fully controlled experimentation.
The Isolated Right Hemisphere
If the right hemisphere has the capacity to carry out speech perception processes, it should be possible to demonstrate speech perception abilities in the isolated right hemisphere. Studies of split brain patients and carotid amobarbital injection studies both indicate that in some cases at least, the isolated right hemisphere has the ability to understand simple (i.e., not syntactically complex) speech35,42,43, and even where comprehension errors do occur, there is some evidence35 to indicate that the errors are semantic in nature, not sound-based, just as is found in aphasia. Further evidence comes from a recent amobarbital study which has demonstrated that the isolated right hemisphere can perform speech discrimination tasks6.
Taken together, these sources of evidence make a strong case for bilateral organization of speech perception systems. However, while both hemispheres appear to contribute to the speech perception process, they likely make differing contributions (Box 1).
PRIVILEGED ROLE OF THE POSTERIOR SUPERIOR TEMPORAL LOBE IN SPEECH PERCEPTION
The evidence reviewed above highlights the bilateral organization of speech perception systems. Many of those studies also seem to implicate the pSTP/pSTG in particular as the critical structures. The relevant evidence includes the following. Word deafness is most often associated with (bilateral) lesions involving the posterior superior temporal lobe38. Bilateral posterior MTG lesions do not appear to have the same effect: Nielsen44 cites such a case (p. 123) in which the lesions affected the posterior one-half of the MTG, with only partial extension into the white matter of the STG. Nielsen reports that “after recovery he was able at all times to understand spoken language perfectly.” In Creutzfeldt et al.’s intraoperative single-unit recording studies, the vast majority of speech-responsive cells were found in the middle portion of the STG (recordings from more posterior sites were not made); very few sites in anterior STG, middle or inferior temporal gyrus yielded detectable responses. In functional activation studies involving passive listening to speech, the most consistent activations tend to cluster around the pSTP and pSTG24. Finally, in a study45 using sub-durally placed grids, with which a temporary inactivation of the underlying tissue can be induced, it was observed that, across patients, stimulation sites along the pSTG consistently compromised performance on speech perception tasks. A variety of sites in superior temporal cortex as well as more inferior temporal lobe sites were tested, but only pSTG sites were associated in every patient with speech perception deficits.
INTERFACES WITH THE SPEECH PERCEPTION SYSTEM
In the present model, posterior superior temporal cortex bilaterally supports the construction of sound-based representations of the speech signal. Beyond this initial elaboration of the input signal, there are at least two “interfaces” that these sound-based representations must enter into. On the one hand, listeners need to recover conceptual-semantic information in order to derive meaningful interpretations from the signal. Conceptual-semantic information is typically assumed to be stored in a widely distributed network throughout the cerebral cortex46. The multi-modal cortical fields in the vicinity of the left temporal-parietal-occipital junction are a good candidate for networks important for interfacing sound-based representations in auditory cortex with conceptual-semantic representations, for both perception and production14,47 -- this may not be a single focal network, but rather a system of networks whose spatial distribution of activation varies as a function of the cortical distribution of conceptual representations onto which a particular speech input is being mapped. This interface system may correspond to the “lemma” level of representation in psycholinguistic models48, in the sense that it serves to bind together different types of information, such as phonemic, semantic, and (although not discussed here) perhaps syntactic information, all of which together define the content of individual entries in the mental lexicon.
On the other hand, speech signals must, in some instances, be interfaced with articulatory information. The existence of an auditory-motor interface is most clearly indicated by one’s ability to repeat heard pseudowords (semantic mediation is impossible), but there is also evidence from lesion work and neuroimaging that such an interface is part of the normal speech production process (Box 2), and some investigators have argued that motor representations participate in speech perception per se49. Assuming that the frontal cortical fields implicated in speech are involved in articulatory processes (not necessarily for overt speech only), a reasonable hypothesis is that the inferior parietal lobe contains an important (but not necessarily the only) interface system mediating between auditory and articulatory representations50 . Parietal systems have been argued to play a privileged role in sensorimotor transformations in a variety of contexts51, and a similar role for the SMG in speech would bring this region in line functionally with other parietal areas. Furthermore, it may provide a principled account of the functional anatomy of phonological working memory50. Phonological working memory -- comprised of an “articulatory subvocal rehearsal” component and a “phonological store” component52 -- can be thought of as a system which uses articulatory mechanisms to keep phonological representations active; it therefore requires a sensory-motor interface. The subvocal rehearsal component appears to be supported by left frontal networks, whereas the phonological store mechanism relies on cortex in the left parietal lobe53,54. We hypothesize that activations attributed to the functions of the phonological store reflect the operations of the proposed auditory-motor interface system in inferior parietal cortex. That is, inferior parietal cortex is not the site of storage of phonological information per se, but rather serves to interface sound-based representations of speech in auditory cortex with articulatory-based representations of speech in frontal cortex. To the extent that speech discrimination and identification tasks rely on this frontoparietal phonological working memory system, the present account could explain why speech-sound perception deficits have been associated with frontal and parietal lobe damage. Although parietal cortex appears to play a significant role in auditory-motor interaction, there are likely other, non-parietal networks which also function as an interface between auditory and motor representations19,55,56, perhaps in the service of other language functions (e.g., normal speech production). (See Figure 2.)
-----------------------
Figure 2 about here
-----------------------
Conclusions
We have argued (i) that the posterior superior temporal lobe bilaterally constitutes the primary cortical substrate for speech perception, (ii) that while both hemispheres participate, they likely contribute differently to speech perception (Box 1), (iii) that left hemisphere frontal and parietal regions previously implicated in speech perception (and also in phonological working memory) may be understood in terms of a system supporting auditory-motor interaction, on analogy to similar systems supporting visual-motor interaction, and (iv) that unimodal auditory cortex in the left pSTP/pSTG participates in aspects of speech production (Box 2). The picture that emerges from these considerations is one in which pSTP/pSTG auditory systems bilaterally are important for assembling sound-based representations which interface both with left hemisphere motor speech-production systems via the parietal lobe, and with lexical-semantic representations via cortical fields in the vicinity of the left temporal-parietal-occipital junction.
Acknowledgments
The work was supported by NIH DC-0361 (GH), the McDonnell-Pew program, the National Science Foundation LIS initiative, and the University of Maryland Center of Comparative Neuroscience (DP).
References
1
Wernicke, C. (1874/1977) in Wernicke’s works on aphasia: A sourcebook and review (Eggert, G.H., ed.), Mouton
2
Luria, A.R. (1970) Traumatic aphasia, Mouton
3
Caplan, D., Gow, D. and Makris, N. (1995) Analysis of lesions by MRI in stroke patients with acoustic-phonetic processing deficits, Neurology 45, 293-298
4
Fiez, J.A. et al. (1995) PET studies of auditory and phonological processing: Effects of stimulus characteristics and task demands, Journal of Cognitive Neuroscience 7, 357-375
5
Poeppel, D. (1995) The neural basis of speech perception, Unpublished doctoral dissertation, MIT
6
Boatman, D. et al. (1998) Right hemisphere speech perception revealed by amobarbital injection and electrical interference, Neurology 51, 458-464
7
Blumstein, S. (1995) in The cognitive neurosciences (Gazzaniga, M.S., ed.), pp. 913-929, MIT Press
8
Liberman, A.M. and Mattingly, I.G. (1989) A specialization for speech perception, Science 243, 489-494
9
Blumstein, S.E., Baker, E. and Goodglass, H. (1977) Phonological factors in auditory comprehension in aphasia, Neuropsychologia 15, 19-30
10
Goodglass, H. and Kaplan, E. (1976) The assessment of aphasia and related disorders, Lea & Febiger
11
Baker, E., Blumstein, S.E. and Goodglass, H. (1981) Interaction between phonological and semantic factors in auditory comprehension, Neuropsychologia 19, 1-15
12
Bachman, D.L. and Albert, M.L. (1988) in Handbook of neuropsychology, Vol. 1 (Boller, F. and Grafman, J., eds), Elsevier
13
Alexander, M.P. (1997) in Behavioral neurology and neuropsychology (Feinberg, T.E. and Farah, M.J., eds), pp. 133-149, McGraw-Hill
14
Damasio, A.R. (1992) Aphasia, New England Journal of Medicine 326, 531-539
15
Damasio, H. (1991) in Acquired aphasia, pp. 45-71, Academic Press
16
Geschwind, N. (1965) Disconnexion syndromes in animals and man, Brain 88, 237-294, 585-644
17
Palumbo, C.L., Alexander, M.P. and Naeser, M.A. (1992) in Conduction aphasia (Kohn, S.E., ed.), pp. 51-75, Lawrence Erlbaum Associates
18
Damasio, H. and Damasio, A.R. (1983) in Localization in neuropsychology (Kertesz, A., ed.), pp. 231-243, Academic Press
19
Damasio, H. and Damasio, A.R. (1980) The anatomical basis of conduction aphasia, Brain 103, 337-350
20
Zatorre, R.J., Evans, A.C., Meyer, E. and Gjedde, A. (1992) Lateralization of phonetic and pitch discrimination in speech processing, Science 256, 846-849
21
Petersen, S.E. et al. (1988) Positron emission tomographic studies of the cortical anatomy of single-word processing, Nature 331, 585-589
22
Price, C.J. et al. (1996) Hearing and saying: The functional neuro-anatomy of auditory word processing, Brain 119, 919-931
23
Mazoyer, B.M. et al. (1993) The cortical representation of speech, Journal of Cognitive Neuroscience 5, 467-479
24
Zatorre, R.J., Meyer, E., Gjedde, A. and Evans, A.C. (1996) PET studies of phonetic processing of speech: Review, replication, and reanalysis, Cerebral Cortex 6, 21-30
25
Schlosser, M.J. et al. (1998) Functional MRI studies of auditory comprehension, Human Brain Mapping 6, 1-13
26
Calvert, G.A. et al. (1997) Activation of auditory cortex during silent lipreading, Science 276, 593-596
27
Binder, J.R. et al. (1994) Functional magnetic resonance imaging of human auditory cortex, Annals of Neurology 35, 662-672
28
Fiez, J.A. et al. (1996) PET activation of posterior temporal regions during auditory word presentation and verb generation, Cerebral Cortex 6, 1-10
29
Goodglass, H. (1993) Understanding aphasia, Academic Press
30
Alexander, M.P., Naeser, M.A. and Palumbo, C. (1990) Broca’s area aphasias: Aphasia after lesions including the frontal operculum, Neurology 40, 353-362
31
Mohr, J.P. et al. (1978) Broca’s aphasia: Pathological and clinical, Neurology 28, 311-324
32
Kleist, K. (1962) Sensory aphasia and amusia: The myeloarchitectonic basis, Pergamon Press
33
Benson, D.F. et al. (1973) Conduction aphasia: A clinicopathological study, Archives of Neurology 28, 339-346
34
Code, C. (1987) Language, aphasia, and the right hemisphere, John Wiley & Sons
35
Zaidel, E. (1985) in The dual brain: Hemispheric specialization in humans (Benson, D.F. and Zaidel, E., eds), pp. 205-231, Guilford Press
36
Yaqub, B.A., Gascon, G.G., Alnosha, M. and Whitaker, H. (1988) Pure word deafness (acquired verbal auditory agnosia) in an Arabic speaking patient, Brain 111, 457-466
37
Albert, M.L. and Bear, D. (1974) Time to understand; a case study of word deafness with reference to the role of time in auditory comprehension, Brain 97, 373-384
38
Buchman, A.S. et al. (1986) Word deafness: One hundred years later, Journal of Neurology, Neurosurgery, and Psychiatry 49, 489-499
39
Poeppel, D. et al. (1996) Task-induced asymmetry of the auditory evoked M100 neuromagnetic field elicited by speech sounds, Cognitive Brain Research 4, 231-242
40
Gage, N., Poeppel, D., Roberts, T.P.L. and Hickok, G. (1998) Auditory evoked M100 reflects onset acoustics of speech sounds, Brain Research 814, 236-239
41
Kuriki, S., Okita, Y. and Hirata, Y. (1995) Source analysis of magnetic field responses from the human auditory cortex elicited by short speech sounds, Experimental Brain Research 104, 144-152
42
McGlone, J. (1984) Speech comprehension after unilateral injection of sodium amytal, Brain and Language 22, 150-157
43
Wada, J. and Rasmussen, T. (1960) Intracarotid injection of sodium amytal for the lateralization of cerebral speech dominance, Journal of Neurosurgery 17, 266-282
44
Nielsen, J.M. (1946) Agnosia, apraxia, aphasia: Their value in cerebral localization, Paul B. Hoeber, Inc.
45
Boatman, D., Lesser, R.P. and Gordon, B. (1995) Auditory speech processing in the left temporal lobe: An electrical interference study, Brain and Language 51, 269-290
46
Damasio, A.R. (1989) The brain binds entities and events by multiregional activation from convergence zones, Neural Computation 1, 123-132
47
Mesulam, M.-M. (1998) From sensation to cognition, Brain 121, 1013-1052
48
Levelt, W.J.M. (1999) Models of word production, Trends in Cognitive Sciences 3, 223-232
49
Liberman, A.M. and Mattingly, I.G. (1985) The motor theory of speech perception revised, Cognition 21, 1-36
50
Aboitiz, F. and García V., R. (1997) The evolutionary origin of language areas in the human brain: A neuroanatomical perspective, Brain Research Reviews 25, 381-396
51
Andersen, R. (1997) Multimodal integration for the representation of space in the posterior parietal cortex, Philos Trans R Soc Lond B Biol Sci 352, 1421-1428
52
Baddeley, A.D. (1992) Working memory, Science 255, 556-559
53
Awh, E. et al. (1996) Dissociation of storage and rehearsal in working memory: PET evidence, Psychological Science 7, 25-31
54
Jonides, J. et al. (1998) The role of parietal cortex in verbal working memory, The Journal of Neuroscience 18, 5026-5034
55
Dronkers, N.F. (1996) A new brain region for coordinating speech articulation, Nature 384, 159-161
56
Romanski, L.M., Bates, J.F. and Goldman-Rakic, P.S. (1999) Auditory belt and parabelt projections to the prefrontal cortex in the Rhesus monkey, Journal of Comparative Neurology 403, 141-157
57
Galaburda, A. and Sanides, F. (1980) Cytoarchitectonic organization of the human auditory cortex, Journal of Comparative Neurology 190, 597-610
Figures
Figure 1.
Lateral view of the left hemisphere with structures inside the Sylvian fossa exposed. H, transverse temporal gyrus of Heschl (which houses primary auditory cortex); PT, planum temporale; aSTP, anterior supratemporal plane; STG, superior temporal gyrus (note that the STG includes the STP); MTG, middle temporal gyrus; AG, angular gyrus; SMG, supramarginal gyrus; 44 and 45 refer to Brodmann’s designations and together comprise Broca’s area; PCG, pre-central gyrus (primary motor cortex); FO, frontal operculum; PO, parietal operculum; I, insula. Colors correspond to a rough cytoarchitectonic parcellation57 into primary, koniocortical fields (gray); secondary, parakoniocortical fields (blue); and fields which are “transitional” between unimodal sensory cortex and multimodal “integration” cortex (pink).
Figure 2.
A simple model of the cortical network supporting speech perception and related language functions. The dashed line indicates the possibility of additional, non-parietal auditory-motor interface networks (see text).
Outstanding Questions
1. We have proposed that superior temporal lobe structures play an important role in constructing “sound-based representations of speech.” This process is complex, probably involving multiple levels of representation. What are the processing subcomponents of this speech perception system? How are they organized in auditory cortex? How do these presumed subsystems map onto the different linguistic levels of representation (e.g., phonetic features, syllabic structure, etc.)?
2. It is common for researchers to draw a distinction between “acoustic” and “phonological” processes in auditory perception. Are there separate processing streams for linguistic versus non-linguistic input in auditory cortex? If so, at what point, anatomically and functionally, do they diverge?
Box 1:
DIFFERING CONTRIBUTION TO SPEECH PERCEPTION BY THE LEFT AND RIGHT TEMPORAL CORTICES
Although both left and right superior temporal cortices are implicated in speech perception, the evidence suggests that the left and right systems make different contributions to perceptual analysis. For example, a lesion study by Robin et al.a showed that patients with left auditory cortex lesions were impaired on tasks involving the perception of temporal information, but unimpaired on tasks involving the perception of spectral information, whereas patients with right auditory cortex lesions showed the reverse effect. Zaidelb reports that the perception of speech sounds by the isolated left or right hemispheres of split brain subjects is differentially affected by the addition of noise to the speech signal: noise adversely affected the performance of the right hemisphere to a greater degree than the performance of the left hemisphere. Differential effects of background noise on the cortical response in the two hemispheres have also been seen using MEGc. Other MEG studies have observed response differences in the two hemispheres that vary as a function of differences in speech stimuli. For example, the latency of the major response of auditory cortex elicited by words varies in the two hemispheres as a function of onset properties of the wordd. Using PET, Belin and colleaguese investigated the response to rapidly (40ms) versus slowly (200ms) changing acoustic transitions in speech-like stimuli. They observed significantly different responses in the auditory cortices, with the left auditory cortex activation being much more extensive in the rapid-transition condition (versus bilaterally symmetric in the slow-transition condition). A recent review of psychophysical and clinical research on processing asymmetries also highlights hemispheric differences in the degree to which the temporal structure of events can be discriminated, with the left hemisphere being selectively better at the discrimination of fine temporal eventsf.
Although the computational basis of these hemispheric differences remains to be determined, several models have been proposed. Ivry and Robertsong have developed the double-filtering by frequency (DFF) model to account for lateralization in auditory and visual perception. In DFF, all perceptual analysis begins with a spectral representation of the stimulus. An attentional filter then determines the point in the frequency domain that is relevant for the analysis of the signal. Around this anchoring point, the percept is processed asymmetrically, with the high-frequency part of the signal (with respect to the attentionally defined anchoring point) being processed in the left hemisphere and the low-frequency part in the right hemisphere. An alternative position is held by Zatorreh, who suggests that the left hemisphere is selectively better at temporal analysis but therefore has to sacrifice spectral resolving power, whereas the right hemisphere is better at spectral analysis. On a third view, the asymmetric sampling in time (AST) model, the left and right temporal cortices analyze the same input signal on differing time scales. In particular, the temporal window of integration is argued to be short for the left hemisphere (on the order of 25ms) and long for the right hemisphere (150-250ms). The observed processing asymmetries for speech and non-speech follow from the asymmetric sampling of the waveform in the time domain.
a
Robin, D.A., Tranel, D., Damasio, H. (1990). Auditory perception of temporal and spectral events in patients with focal left and right cerebral lesions. Brain and Language, 39, 539-555.
b
Zaidel, E. (1985). Language in the right hemisphere. In The dual brain: Hemispheric specialization in humans, D. F. Benson and E. Zaidel, eds. (New York: Guilford Press), pp. 205-231.
c
Shtyrov, Y., Kujala, T., Ahveninen, J., Tervaniemi, M., Alku, P., Ilmoniemi, R.J., Näätänen, R. (1998). Background acoustic noise and the hemispheric lateralization of speech processing in the human brain: magnetic mismatch negativity study. Neuroscience Letters, 251, 141-144.
d
Gage, N., Poeppel, D., Roberts, T.P.L. and Hickok, G. (1998) Auditory evoked M100 reflects onset acoustics of speech sounds, Brain Research 814, 236-239
e
Belin, P., Zilbovicius, M., Crozier, S., Thivard, L., Fontaine, A., Masure, M.-C. and Samson, Y. (1998). Lateralization of speech and auditory temporal processing. Journal of Cognitive Neuroscience, 10, 536-540.
f
Nicholls, M.E.R. (1996). Temporal processing asymmetries between the cerebral hemispheres: evidence and implications. Laterality 1(2): 97-137.
g
Ivry, R. and Robertson, L. (1998). The two sides of perception. Cambridge, MA: MIT Press.
h
Zatorre, R.J. (1997). Cerebral correlates of human auditory processing: Perception of speech and musical sounds. In J. Syka (Ed.), Acoustical Signal Processing in the Central Auditory System. Plenum Press.
BOX 2:
A ROLE FOR LEFT AUDITORY CORTEX IN SPEECH PRODUCTION
Early models of the neurology of language hypothesized the involvement of auditory cortex in speech productiona. Specifically, the same auditory networks involved in speech perception were thought to be activated during speech production. This claim was based on the observation that aphasics with superior temporal lobe lesions often presented with disordered (i.e., paraphasic) speech production. Recent functional imaging evidence has confirmed that left auditory cortex, including unimodal fields, is indeed activated during speech production. Using PET, Paus and colleaguesb, for example, demonstrated a correlation between rate of speech production (whispering “ba-lu” repeatedly, with white noise masking auditory feedback) and activation of left secondary auditory cortex posterior to Heschl’s gyrus, suggesting a relation between “phonological load” and auditory cortex activity. Another studyc used magnetoencephalography to gauge the time course and spatial distribution of activations during an object naming task. The investigators noted a cluster of activation in the vicinity of the left supratemporal plane which emerged in a time window (275-400 msec after stimulus onset) corresponding to when “phonological encoding” during object naming is thought to occurc. A recent fMRI study of silent object naming also reported activation in the left pSTPd (see figure). These findings are consistent with the hypothesis that the left pSTP participates not only in speech perception, but also in some sort of phonemic processes during speech productione. The functional significance of these activations is demonstrated by the finding that lesions to the left pSTP are associated with conduction aphasiaf, a syndrome which, because of the prevalence of phonemic errors in speech output, has been characterized by some as an impairment of phonemic encoding.
Figure legend
Activation of left auditory cortex in the pSTP during a silent object naming task.
a
Wernicke, C. (1874/1977) in Wernicke’s works on aphasia: A sourcebook and review (Eggert, G.H., ed.), Mouton
b
Paus, T. et al. (1996) Modulation of cerebral blood flow in the human auditory cortex during speech: Role of motor-to-sensory discharges, European Journal of Neuroscience 8, 2236-2246
c
Levelt, W.J.M. et al. (1998) An MEG study of picture naming, Journal of Cognitive Neuroscience 10, 553-567
d
Hickok, G. et al. (1999) Auditory cortex participates in speech production, Cognitive Neuroscience Society Abstracts , 97
e
Hickok, G. (in press) in Language and the brain (Grodzinsky, Y., Shapiro, L. and Swinney, D., eds), Academic Press
f
Damasio, H. and Damasio, A.R. (1980) The anatomical basis of conduction aphasia, Brain 103, 337-350
Box 3:
SPEECH PERCEPTION, SPEECH PRODUCTION, AND THE FLUENT APHASIAS
Conduction aphasia (CA) is an acquired disorder of language characterized by good auditory comprehension, fluent speech production, relatively poor speech repetition, frequent phonemic errors in production, and naming difficultiesa,b. Although the repetition disorder has gained clinical prominence in the diagnosis of CA, phonemic errors are frequently observed not only in repetition tasks, but also in spontaneous speech, oral reading, and naming tasksa. At least two forms of CA have been identified behaviorallyc. One, referred to as reproduction CA, appears to reflect a primary deficit in phonological encoding for productionc,d. The other, referred to as repetition CA, is thought to derive from a deficit involving primarily verbal working memoryc. In addition, two lesion patterns have been associated with CA. The pattern most commonly highlighted is one involving damage to the SMG, and is classically thought to cause CA by interrupting the white matter pathway connecting posterior and anterior language systemse. But CA can also be caused by a lesion involving left auditory cortices and the insula, sometimes with the SMG fully sparedf. The relation between the two behavioral and lesion patterns associated with CA has not been studied directly, but evidence discussed in this review motivates a hypothesis. The left SMG appears to participate in verbal working memoryg-i, and we have suggested here that left auditory cortices participate in phonemic aspects of speech production (Box 2). A hypothesis consistent with this set of facts is that reproduction CA is a phonemic-level processing disorder caused by damage to the auditory complex in the left supratemporal plane, whereas repetition CA is a verbal working memory disorder caused by damage to left inferior parietal lobe structures. Neither of these types of CA relies on the classic notion of a disconnection syndrome for its explanation. Comprehension in CA is relatively intact (i) because right hemisphere systems can support speech perception to a reasonable degree in cases where left hemisphere speech perception systems are damaged, and (ii) because cortical fields around the left temporal-parietal-occipital junction, which are important for interfacing sound-based representations with conceptual-semantic representations, are intact. When these sound-meaning interface systems are damaged in addition to the auditory cortex lesion, comprehension is compromised and output contains both phonemic- and semantic-level errors; that is, Wernicke’s aphasia emerges. Finally, if a lesion involves these hypothesized sound-meaning mapping systems but spares the pSTP/pSTG, comprehension is compromised, semantic errors in production are apparent, but phonemic-level errors in production are minimized (and verbatim repetition is possible); that is, transcortical sensory aphasia (TCSA) emerges. Thus the present framework provides an account of the fluent aphasias: reproduction CA can be viewed as a disorder at the phonemic level, TCSA can be viewed as a disorder at the level of the sound-meaning interface, and Wernicke’s aphasia can be viewed as a combination of the two.
a
Goodglass, H. (1992) in Conduction aphasia (Kohn, S.E., ed.), pp. 39-49, Lawrence Erlbaum Associates
b
Damasio, A.R. (1992) Aphasia, New England Journal of Medicine 326, 531-539
c
Shallice, T. and Warrington, E. (1977) Auditory-verbal short-term memory impairment and conduction aphasia, Brain and Language 4, 479-491
d
Wilshire, C.E. and McCarthy, R.A. (1996) Experimental investigations of an impairment in phonological encoding, Cognitive Neuropsychology 13, 1059-1098
e
Geschwind, N. (1965) Disconnexion syndromes in animals and man, Brain 88, 237-294, 585-644
f
Damasio, H. and Damasio, A.R. (1983) in Localization in neuropsychology (Kertesz, A., ed.), pp. 231-243, Academic Press
g
Jonides, J. et al. (1998) The role of parietal cortex in verbal working memory, The Journal of Neuroscience 18, 5026-5034
h
Awh, E. et al. (1996) Dissociation of storage and rehearsal in working memory: PET evidence, Psychological Science 7, 25-31
i
Paulesu, E., Frith, C.D. and Frackowiak, R.S.J. (1993) The neural correlates of the verbal component of working memory, Nature 362, 342-345