
Thursday, May 6, 2010

Gesture discrimination deficits implicate temporal-parietal regions not the "mirror system"

A new study in J. Neuroscience failed to replicate a previous finding, published in the same journal, that linked gesture discrimination deficits to tissue loss in the inferior frontal gyrus, part of the supposed human mirror system. The new study, by Nelissen et al., examined correlations between gesture discrimination (and a range of other language and non-language tasks) and patterns of tissue degeneration in primary progressive aphasia (PPA). They found that gesture discrimination did not correlate with tissue loss in the inferior frontal gyrus, and instead correlated with tissue loss in posterior temporal-parietal regions, including portions of the superior temporal gyrus, which is not part of the human "mirror system" (see Fig. 2A from Nelissen et al., below).


This is from the same group that published the previous stroke-based study that reported an association between tissue damage in the inferior frontal gyrus (part of the "mirror system") and gesture discrimination deficits (see my critique of that study here, or in Hickok, 2009). The same stimuli were used in both studies, so why the difference?

I had pointed out that the stroke study was potentially biased because they used percent correct for their gesture discrimination-brain lesion correlation rather than d' (d-prime), which corrects for response bias. The previous study used a yes/no response paradigm: subjects saw a gesture and had to decide whether it was correctly executed or not. Using percent correct is a problem because some subjects (perhaps as a function of their lesion!) may be biased toward "yes" or "no" responses, which can skew the results independently of how well they are actually discriminating the gestures. The present study used the same stimuli but with a modified, 3-alternative forced-choice design: subjects viewed three gestures and then had to decide which of the three was correctly executed. A still frame of the gesture was left on the screen to minimize working memory effects. This paradigm reduces bias, especially when the order and position of the correct item are counterbalanced, as was done by Nelissen et al. It appears, then, that reducing response bias shifted the brain region that showed a correlation with gesture discrimination performance.
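
To make the bias issue concrete, here is a minimal signal-detection sketch in Python with made-up numbers (not data from either study): two hypothetical patients with the same underlying sensitivity, one of whom is biased toward responding "yes." Percent correct penalizes the biased patient; d' does not.

```python
# Illustrative only: same discrimination ability, different response bias.
from scipy.stats import norm

def d_prime(hit_rate, false_alarm_rate):
    """Yes/no sensitivity index: z(hits) - z(false alarms)."""
    return norm.ppf(hit_rate) - norm.ppf(false_alarm_rate)

patients = {
    "unbiased":   (0.80, 0.20),   # (hit rate, false-alarm rate)
    "yes-biased": (0.95, 0.485),  # same sensitivity, criterion shifted toward "yes"
}
for label, (hits, fas) in patients.items():
    pc = (hits + (1 - fas)) / 2   # percent correct, assuming equal numbers of yes/no trials
    print(f"{label}: percent correct = {pc:.2f}, d' = {d_prime(hits, fas):.2f}")
```

The biased patient's percent correct drops (about 0.73 vs. 0.80) even though d' is essentially identical, which is exactly the confound that d' or a forced-choice design avoids.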

There is much more to this study than what I've highlighted here, including their finding that measures of gesture and language processing are highly correlated. But that's a topic for another blog entry.

References

Hickok G (2009). Eight problems for the mirror neuron theory of action understanding in monkeys and humans. Journal of cognitive neuroscience, 21 (7), 1229-43 PMID: 19199415

Nelissen, N., Pazzaglia, M., Vandenbulcke, M., Sunaert, S., Fannes, K., Dupont, P., Aglioti, S., & Vandenberghe, R. (2010). Gesture Discrimination in Primary Progressive Aphasia: The Intersection between Gesture and Language Processing Pathways Journal of Neuroscience, 30 (18), 6334-6341 DOI: 10.1523/JNEUROSCI.0321-10.2010

Friday, April 30, 2010

Auditory short-term memory and the left superior temporal gyrus

Where is the "phonological store"? Ask the typical cognitive neuroscientist on the street and you will probably be pointed to the left inferior parietal lobe. But this is incorrect. First, the idea that there is a dedicated "phonological store" is probably incorrect. Second, the system that supports the temporary maintenance of phonological information isn't in the parietal lobe, but in the superior temporal region, i.e., the same general region that supports phonological processing during speech recognition. These claims have been made on the basis of functional imaging data (e.g., Hickok et al., 2003; Buchsbaum & D'Esposito, 2008). But now there's lesion evidence to back it up.

Leff et al. (2009) studied a whopping 210 stroke patients, testing them on a range of speech tasks and on their measure of auditory STM, digit span. Not surprisingly, damage to pretty much the whole left perisylvian cortex correlated with digit span measures (figure below, top row of images). But when the authors factored out processes such as speech production (naming measure), single-word speech repetition (auditory word and nonword repetition), and higher-level functions (verbal fluency measure), the remaining correlate of digit span was a relatively small sector of the STG/STS (figure below, bottom row of images). This region was further shown to correlate with auditory (but not visual) sentence comprehension; it did not correlate with auditory word comprehension.
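
For readers unfamiliar with this kind of "factoring out," here is a conceptual sketch in Python. It is not the authors' actual voxel-based analysis pipeline, and the data and variable names are hypothetical; the point is just that, at each voxel, digit span is regressed on lesion status together with the nuisance scores, so only variance not shared with naming, repetition, or fluency is attributed to damage at that voxel.

```python
# Conceptual sketch only: hypothetical data, not the Leff et al. pipeline.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n_patients, n_voxels = 210, 1000
lesion = rng.integers(0, 2, size=(n_patients, n_voxels))       # 1 = voxel damaged
naming, repetition, fluency = rng.normal(size=(3, n_patients))  # nuisance scores
digit_span = rng.normal(size=n_patients)                        # auditory STM measure

t_map = np.empty(n_voxels)
for v in range(n_voxels):
    X = sm.add_constant(np.column_stack([lesion[:, v], naming, repetition, fluency]))
    fit = sm.OLS(digit_span, X).fit()
    t_map[v] = fit.tvalues[1]   # effect of damage with the covariates partialled out
# t_map would then be thresholded to find voxels uniquely related to digit span.
```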


What this shows is that the left STG/STS is critical for auditory STM. What is less clear, though, is how this region relates to systems involved in phonological processing during normal speech recognition. The issue centers on whether short term maintenance of phonological information is achieved by activating the same phonological processing networks that are involved in speech recognition or whether there is a separate "store". The fact that the left STG/STS region identified by Leff et al. did not correlate with auditory word comprehension seems to suggest a separate phonological store. However, this isn't necessarily the case. For example, if phonological maintenance involves only a sub-portion of the phonological recognition network -- e.g., if the recognition system were bilateral as we've argued -- then maintenance and recognition may dissociate, so the non-correlation with auditory word comprehension is not surprising. Why does left but not right STG/STS damage cause STM deficits? Because STM is dependent on connections with the motor speech system, which is strongly left dominant.

What seems more puzzling for a common network model, though, is the fact that the left STG/STS region correlated with digit span even when nonword repetition was factored out. That is, nonword repetition requires accurate phonemic perception of the stimulus and an interface with the motor speech system, which should implicate left phonological processing systems. So how can damage to left STG/STS affect digit span but not nonword repetition? One possibility is that the damage to left STG/STS represents partial damage to the phonological system within the left hemisphere, perhaps specifically involving phonological sub-networks that represent larger phonological chunks or sequences; or maybe it produces just enough damage to affect the more difficult processes while leaving the easier tasks spared.

Generally, this finding provides fairly compelling evidence that the left STG/STS plays a critical role in auditory/phonological STM.

References

Buchsbaum, B.R., and D'Esposito, M. (2008). The search for the phonological store: from loop to convolution. J Cogn Neurosci 20, 762-778.

Hickok, G., Buchsbaum, B., Humphries, C., and Muftuler, T. (2003). Auditory-motor interaction revealed by fMRI: Speech, music, and working memory in area Spt. Journal of Cognitive Neuroscience 15, 673-682.

Leff, A., Schofield, T., Crinion, J., Seghier, M., Grogan, A., Green, D., & Price, C. (2009). The left superior temporal gyrus is a shared substrate for auditory short-term memory and speech comprehension: evidence from 210 patients with stroke Brain, 132 (12), 3401-3410 DOI: 10.1093/brain/awp273

Tuesday, April 13, 2010

Recognizing facial expressions without the capacity to produce them -- Moebius Syndrome

Former TB West grad student, now Rotman Institute faculty member, Brad Buchsbaum pointed me to this interesting NY Times article on Moebius Syndrome, a congenital disorder that causes facial paralysis. Much of the article focuses on the social impact of the inability to express emotions on the face. Of particular interest here, however, is the reference in the article to a new study by Kathleen Rivas Bogart and David Matsumoto on the recognition of emotional facial expression by people with Moebius syndrome. These authors report no difference between individuals with Moebius syndrome and controls in the ability to recognize emotional facial expressions. This reinforces what we've been saying here for a long time now: in contrast to the central claim of mirror neuron theorists, you don't need to be capable of generating an action to recognize/understand that action in others.

Wednesday, March 31, 2010

On grandmother cells and parallel distributed models

Jeff Bowers has published a paper or two arguing for the viability of grandmother cells -- cells that represent whole "objects" such as a specific face (or your grandmother's face). At issue, of course, is whether the brain represents information in a localist or distributed fashion and Jeff has used his case for grandmother cells as evidence against a basic assumption of parallel distributed processing (PDP) models. But the PDP folks don't seem to think "distributed" is a necessary property of PDP models. So in the guest post below, Jeff asks, What does the D in PDP actually mean? This is an interesting question, and Jeff would like to know your thoughts (see the new survey to respond). I'd also be interested in your thoughts on grandmother cells!

Guest Post from Jeff Bowers:
I’ve been involved in a recent debate regarding the relative merits of localist representations and the distributed representations learned in Parallel Distributed Processing (PDP) networks. By localist, I mean something like a word unit in the interactive activation (IA) model – a unit that represents a specific word (like a “grandmother cell”). By distributed, I mean that a familiar word (or an object or a face, etc.) is coded as a pattern of activation across a set of units, with no single unit sufficient for representing an item (you need to consider the complete pattern). In Bowers (2009, 2010) I argue that the neuroscience is more consistent with localist coding compared to the distributed representations in PDP networks, contrary to the widespread assumption in the cognitive science community. That is, single-cell recordings of neurons in cortex and hippocampus often reveal neurons that are remarkably selective in their responding (e.g., a neuron that responds to one face out of many). I took this to be more consistent with localist compared to distributed PDP theories.

This post, however, is not about whether localist or PDP models are more biologically plausible. Rather, I’m curious as to what people think is the theory behind PDP models; specifically, what is your understanding regarding the relation between distributed representations and PDP models? In Bowers (2009, 2010) I claim that PDP models are committed to the claim that information is coded in a distributed format rather than a localist format. On this view, the IA model of word identification that includes single units to code for specific words (e.g., a DOG unit) is not a PDP model. Neither are neural networks that learn localist representations, like the ART models of Grossberg. On my understanding, a key (necessary) feature of the Seidenberg and McClelland model of word naming that makes it part of the PDP family is that it learns distributed representations of words – it gets rid of localist word representations.
However, Plaut and McClelland (2010) challenge this characterization of PDP models. That is, they write:

In accounting for human behavior, one aspect of PDP models that is especially critical is their reliance on interactivity and graded constraint satisfaction to derive an interpretation of an input or to select an action that is maximally consistent with all of the system’s knowledge (as encoded in connection weights between units). In this regard, models with local and distributed representations can be very similar, and a number of localist models remain highly useful and influential (e.g., Dell, 1986; McClelland & Elman, 1986; McClelland & Rumelhart, 1981; McRae, Spivey-Knowlton, & Tanenhaus, 1998). In fact, given their clear and extensive reliance on parallel distributed processing, we think it makes perfect sense to speak of localist PDP models alongside distributed ones. (p. 289)

That is, they argue that the PDP approach is not in fact committed to distributed representations. Elsewhere they write:

In fact, the approach takes no specific stance on the number of units that should be active in representing a given entity or in the degree of similarity of the entities to which a given unit responds. Rather, one of the main tenets of the approach is to discover rather than stipulate representations. (p. 286)

So on this view, the PDP approach does not rule out the possibility that a neural network might actually learn localist grandmother cells in the appropriate training conditions.

With this as background, I would be interested in people’s views on this. Here is my question:

Are PDP theories of cognition committed to the claim that knowledge is coded in a distributed rather than a localist format? [see new survey]

Thanks for your thoughts,

Jeff

References

Bowers JS (2009). On the biological plausibility of grandmother cells: implications for neural network theories in psychology and neuroscience. Psychological review, 116 (1), 220-51 PMID: 19159155

Bowers JS (2010). More on grandmother cells and the biological implausibility of PDP models of cognition: a reply to Plaut and McClelland (2010) and Quian Quiroga and Kreiman (2010). Psychological review, 117 (1) PMID: 20063980

Plaut, D., & McClelland, J. (2010). Locating object knowledge in the brain: Comment on Bowers’s (2009) attempt to revive the grandmother cell hypothesis. Psychological Review, 117 (1), 284-288 DOI: 10.1037/a0017101

Friday, March 26, 2010

Self-destruction of the mirror neuron theory of action understanding

Rizzolatti & Sinigaglia's new Nature Reviews Neuroscience paper on the mirror system is effectively an admission that the mirror neuron theory of action understanding is wrong. The original idea was interesting: we understand actions by mirroring those actions in our own motor system. But this is no longer the case according to R&S:

By matching individual movements, mirror processing provides a representation of body part movement that might serve various functions (for example, imitation), but is devoid of any specific cognitive importance per se. p. 269

Instead, understanding comes from matching higher-order representations of the goals of the action (the quote above continues immediately):

By contrast, through matching the goal of the observed motor act with a motor act that has the same goal, the observer is able to understand what the agent is doing. p. 269

This goal-matching, according to R&S, is quite independent of any specific motor act:
...among the neurons in various areas that become active during action observation, only those that can encode the goal of the motor behaviour of another individual with the greatest degree of generality can be considered to be crucial for action understanding... Indeed, parieto-frontal mirror neurons encode the goal of observed motor acts regardless of whether they are performed with the mouth, the hand or even with tools. p. 269

So, mirror neurons, those cells that fire during specific actions such as grasping-with-the-hand and while watching the same specific action -- the very cells that got everyone SO excited -- are not involved in action understanding. Rather, according to R&S, action understanding is achieved by cells that do not code for actions at all, but something higher level, goals/intentions.

It's worth noting that R&S directly contradict themselves in the sidebar definition of "Mirror-based action understanding":

The comprehension of an observed action based on the activation of a motor programme in the observer’s brain. p. 265

A motor program presumably controls a specific action, such as grasping-with-the-hand, not an action-independent goal or intention.

Is the mirror system needed for coding all goals and intentions? No, according to R&S:
This does not mean that the parieto-frontal mirror mechanism mediates all varieties of intention understanding. p. 271

But they want to say that the system IS needed for coding motor goals/intentions. They illustrate with an example:

Mary is interacting with an object (for example, a cup). According to how she is grasping the cup, we can understand why she is doing it (for example, to drink from it or to move it). This kind of understanding can be mediated by the parieto-frontal mirror mechanism by virtue of its motor chain organization. p. 271

It is true that we can make limited inferences about what Mary's intentions are by the way she grasps the cup. But (i) these inferences are underdetermined by the movement and (ii) motor experience with grasping cups is not necessary for making these inferences. Regarding (i), if Mary grasps the handle, rather than pushing the side with her fingertips, we may infer that she intends to drink rather than move. However, Mary could just as well be moving the cup, picking it up to put it in the sink, or picking it up to give to someone. Regarding (ii), simply having perceptual experience with grasping for drinking versus pushing for moving actions will result in the same inferential ability even without motor experience with cups.

Acknowledging the point that action understanding does not require the motor system, R&S cite several studies showing that understanding can be achieved without the mirror system. They conclude,

These data indicate that the recognition of the motor behaviour of others can rely on the mere processing of its visual aspects. p. 270

So, to summarize:

1. Cells that mirror specific actions (i.e., congruent mirror neurons) don't support action understanding.
2. The real work of action understanding is done by cells that abstract away from actions and instead code goals and intentions.
3. Intentions can be coded outside the mirror/motor system.
4. The recognition of actions of others can be achieved outside the mirror/motor system.

So what does the mirror system contribute?

...[it] allows an individual to understand the action of others ‘from the inside’ p. 264

What does "from the inside" mean to R&S?

...the observed action is understood from the inside as a motor possibility and not just from the outside as a mere visual experience. p. 270

In other words, it is the "understanding" that I-can-do-that-too and nothing more.

Rizzolatti, G., & Sinigaglia, C. (2010). The functional role of the parieto-frontal mirror circuit: interpretations and misinterpretations Nature Reviews Neuroscience, 11 (4), 264-274 DOI: 10.1038/nrn2805

Monday, March 22, 2010

Mirror neurons support action understanding -- "from the inside"?

I think we are getting closer to understanding what mirror neurons are doing. No longer is it claimed that mirror neurons are THE basis for "action understanding". Now, according to Rizzolatti & Sinigaglia's new review (2010), there are several non-mirror mechanisms that can accomplish this:

We conclude that, although there are several mechanisms through which one can understand the behaviour of other individuals...


Mirror neurons do something else (R&S would probably prefer the term "more", but "else" is better, I think):

the parieto-frontal mechanism is the only one that allows an individual to understand the action of others 'from the inside'


Let me explain what "from the inside" means, or at least provide an alternative to whatever R&S mean by it. In my "Eight Problems" paper (Hickok, 2009) I noted that I can understand the action of saxophone playing even though I've never performed the actions associated with saxophone playing. R&S acknowledge that, indeed, I do understand the action of saxophone playing, and do so without the benefit of mirror neurons. But they suggest my understanding is lacking something, namely that extra bit of knowledge that comes from knowing how to play a saxophone. So recognizing an action that I know how to perform = basic action understanding from non-mirror systems + mirror neuron-driven knowledge that hey I know how to make MY motor system do that!

In other words, mirror neurons support the knowledge of how to perform an action that one is observing -- that is, mirror neurons are part of the same old "how" stream that vision neuroscientists and, more recently, auditory neuroscientists have been working on for more than a decade (Milner & Goodale, 1995). The "how" stream, of course, supports sensory-motor integration, or in R&S's terms, action understanding "from the inside". This is why you see motor activation during the perception of actions that you can perform: it is sensory-motor association.

Importantly, notice that there is no magical semantic knowledge that suddenly falls from heaven when we know how to perform an action. I can teach you a new word, glemph, and you can learn to reproduce it with your vocal tract so that subsequent presentations of acoustic glemph will activate your motor system by association. I could do the same with a sign language gesture. It doesn't take on meaning until you link the sensory-motor ensemble to a conceptual structure. So for example, I could define glemph (or the sign language equivalent) as 'the act of publishing on the topic of mirror neurons'. Now the sensory-motor ensemble has meaning and you understand it, but does that mean that the meaning and the understanding are now suddenly coded or even augmented by the sensory-motor ensemble itself? Put differently, do you now have a different understanding of the concept 'the act of publishing on the topic of mirror neurons' just because you have a new sensory-motor associate of the concept? No, you just know how to articulate the word that is associated with the concept.

I suggest that this interpretation of "understanding from the inside" explains every mirror neuron-related observation, does so more parsimoniously than Rizzolatti's account, and has more empirical support from research on aphasia and apraxia.

References

Hickok G (2009). Eight problems for the mirror neuron theory of action understanding in monkeys and humans. Journal of cognitive neuroscience, 21 (7), 1229-43 PMID: 19199415

Milner, A.D., and Goodale, M.A. (1995). The visual brain in action, Oxford: Oxford University Press.

Rizzolatti, G., & Sinigaglia, C. (2010). The functional role of the parieto-frontal mirror circuit: interpretations and misinterpretations Nature Reviews Neuroscience, 11 (4), 264-274 DOI: 10.1038/nrn2805

Friday, March 19, 2010

Mirror Neurons - The unfalsifiable theory

I recently had the pleasure of giving a lecture on mirror neurons at UC San Diego which is a very active locale for folks working on the human mirror system. I expected a lot of push-back on my critical views of mirror neurons, and I wasn't disappointed.

One of my major points of emphasis is and has been that if the mirror neuron system is really important for action understanding, then damage to action execution should result in action understanding deficits. I have pointed out that this prediction doesn't hold, either in apraxia or, with more force, in aphasia.

A typical response to this argument is that "the mirror system" involves lots of areas working together (George Lakoff, who was at the UCSD lecture, even seemed willing to include the STS in this network) and so it is no surprise that damage to fronto-parietal areas doesn't result in the expected action understanding deficits. Along these lines, some folks pointed out that mirror responses have now been found in lots of brain areas, not just the frontal and parietal areas where they have been documented in monkeys. In other words, the mirror system is expanding.

At this point in the talk, Pat Churchland, who was my host, jumped in and said (and I paraphrase here), "Now wait a minute. If mirror neurons are all over the brain then don't they lose their explanatory power? Aren't we now just back to our old friend, the How Does the Brain Work Problem?" Very true, I think.

I ran into this same issue in my debate with Fadiga at the Neurobiology of Language Conference in Chicago. I said, here are several examples of preserved speech perception in the face of an absence of speech production ability, and the mirror neuron proponents said basically, it's a large system that's too complicated to succumb to the loss of the motor system.

A new review by Rizzolatti and Sinigaglia (2010) also pushes back against some of the critiques that have been raised lately. One concerns the observation that mirror responses in a TMS paradigm can be re-trained such that they no longer mirror, and thus dissociate from understanding. R&S: "The reason is that, in the task, there was nothing to understand: the investigated movements were meaningless." Another is the "surprising" claim (mine) that damage to the fronto-parietal motor system should be associated with deficits in action understanding. R&S: "As clearly shown by electrophysiological mapping, there are motor sectors in the monkey inferior parietal lobule (and even in area PFG) with and without mirror neurons. Thus, dissociations between motor deficits and action understanding deficits can and do occur." Wait, I thought the mirror system got its magical powers from being part of the motor system. If the motor system is functionally disrupted, shouldn't the action understanding system also be disrupted? Or is there a parallel mirror motor system that can't control movement but can understand action (which kind of sounds like a sensory system to me)?

I think the mirror neuron folks have a serious problem on their hands: there is apparently no empirical result that can falsify the theory. If a mirror neuron shows up in an unexpected place, it is a new part of the mirror system. If a mirror neuron's activity dissociates from action understanding, it was not coding understanding at that moment. If damage to the motor system doesn't disrupt understanding, it is because that part of the motor system isn't mirroring.

Can someone from the mirror neuron camp come forward and provide us with an example of what kind of empirical result would falsify the theory? Because if you can't falsify it, it's no longer a scientific theory, it's religion.

Rizzolatti G, & Sinigaglia C (2010). The functional role of the parieto-frontal mirror circuit: interpretations and misinterpretations. Nature reviews. Neuroscience PMID: 20216547

Friday, March 12, 2010

Mis-localization of fMRI activation at the temporal-parietal boundary

In the many fMRI studies we have now done looking at sensory-motor integration for speech and related functions, we have repeatedly observed that sensory-motor area Spt (i) shows up more robustly in individual subjects than in group averages and (ii) localizes to the posterior planum temporale region in individual subjects but sometimes localizes to the inferior parietal lobe in group-averaged normalized data.

The results below are an example of this (data represent the work of my graduate student, Lisette Isenberg). This is a study involving 17 subjects performing a listen-then-covertly-repeat task (roughly, shadowing) contrasted with a listen-only condition. Data were warped to Talairach space following typical procedures and projected onto a template brain (left image below). Notice that the focus of activation is relatively small and squarely on the supramarginal gyrus. When we looked at individual subjects, however, there was no activation in the SMG; instead, activation was consistently ventral to it, in the Sylvian fissure.



The image on the right uses ANTS (Advanced Normalization ToolS) which performs a diffeomorphic transformation to align the images based on anatomical features. The structural image you see here (right) is actually the 17 brains of the participants nicely aligned to each other. With their sulci and gyri neatly aligned, the activation in the right image is now substantially larger (functional regions are lining up better) and correctly localized to the posterior Sylvian region.
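
For anyone who wants to try this, here is a minimal ANTsPy sketch of the diffeomorphic (SyN) alignment step. This is a generic, assumed workflow with placeholder file names, not Lisette's exact pipeline: each subject's anatomy is registered to a common target and the resulting transform is applied to that subject's functional map.

```python
# Assumed workflow with placeholder file names; not the exact pipeline used here.
import ants

fixed = ants.image_read("alignment_target_T1.nii.gz")    # template or study-specific target
moving = ants.image_read("subject01_T1.nii.gz")
stat_map = ants.image_read("subject01_stat_map.nii.gz")

# Diffeomorphic (SyN) registration of the subject's anatomy to the target
reg = ants.registration(fixed=fixed, moving=moving, type_of_transform="SyN")

# Apply the same transform to the subject's functional/statistical map
stat_in_target = ants.apply_transforms(
    fixed=fixed, moving=stat_map, transformlist=reg["fwdtransforms"]
)
ants.image_write(stat_in_target, "subject01_stat_map_SyN.nii.gz")
```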

What seems to happen is that the standard template brains have a Sylvian fissure that is sometimes too flat compared to the sample population, in which the fissure can slope upward more at its posterior end. Activations at the posterior margin, such as Spt activations, then end up mis-localized to the parietal lobe. ANTS alignment seems to solve the problem, though.

For this reason, anytime someone reports an SMG or inferior parietal lobe activation, I question whether it might actually be in the posterior Sylvian fissure.

Thursday, March 11, 2010

Phonemic segmentation in speech perception -- what's the evidence?

It is a commonly held belief that speech perception involves the recovery of segmental information -- that is, the speech stream is analyzed in such a way that individual phonemes are recovered. So a typical story is that we analyze spectro-temporal features to recover phonemes, which are put together to form syllables and then phonological words, enabling lexical-semantic access. We've suggested, as have others, that maybe the syllable is a basic unit of analysis, while at the same time leaving open the possibility that we might also access segmental information. For example, as in this figure from Hickok & Poeppel 2007:



Or this overly simplified cartoon from Hickok 2009:



So here's the question: what exactly is the evidence that we access segmental information in perception? Do we even need phonemes for speech perception? Why?

Let me play devil's advocate and claim that we don't extract or represent phonemes at all in speech perception (production is a different story). We do it all with syllables.

Convince me that I'm wrong.


Hickok, G. & Poeppel, D. (2007). The cortical organization of speech processing. Nature Reviews Neuroscience, 8, 393-402

Hickok, G. (2009). The functional neuroanatomy of language. Physics of Life Reviews, 6, 121-143.

Tuesday, March 9, 2010

Not on mirror neurons: Whose stuff do I have to read this week?

David Gow has published a series of papers on the cortical basis of speech perception using pretty sophisticated analytic tools that do not often get applied to the type of data we are used to.

For example, in this Cognition paper: "Articulatory mediation of speech perception: A causal analysis of multi-modal imaging data" (2009) by Gow and Segawa, Granger causality analyses are used to support the Motor Theory.

And in this NeuroImage paper: "Lexical influences on speech perception: A Granger causality analysis of MEG and EEG source estimates" by Gow, Segawa, Ahlfors, and Lin (2008), top-down effects demonstrated by Granger causality analysis appear to have a lexical origin and compelling effects on phonetic perception. This is a longstanding battle in spoken word recognition, and I'm pretty enthused to see new data of this type addressing this controversy.

Common to some of David's recent work is a demonstration of the pretty compelling contribution of top-down factors in the analysis of the speech signal. One thing that is a little less obvious is why these top-down effects should have the supramarginal gyrus as a critical ingredient. It's my homework this week to work through these papers in sufficient detail to really get my head around them. I am already predisposed to the top-down part, but I do need to understand why the SMG would be the critical node. What's really impressive is the thoughtful integration of EEG, MEG and MRI data.

David, if you're reading this, it would be great to get a bit of discussion going on this issue. For example, how deeply held is your commitment to the SMG? Did you guys find any data that challenge that conclusion, or are you willing to bet something substantial?

Thursday, February 25, 2010

A chink in the mirror neuron armor?

It has yet to slow the pace of mirror neuron speculation, but at least a reference to my "Eight Problems" paper, as well as this blog, made it onto the Mirror Neuron Wikipedia page (no, it wasn't me who added it). It actually lists the 8 problems in the "Criticism" (singular, I note) section, which sits right below the section detailing how mirror neurons explain understanding intentions, empathy, language, autism, theory of mind, and gender differences -- apparently the female mirror system is more robust than the male mirror neuron system. Ugh.

Tuesday, February 23, 2010

Lexical effects in speech perception

The influence of the motor system on speech perception has been getting tons of high-profile attention lately, and "sensorimotor theories" of speech perception are gaining popularity. For an interesting example of such a theory, check out Jean-Luc Schwartz et al.'s The Perception-for-Action-Control Theory (PACT): A perceptuo-motor theory of speech perception.

It is all well and good to understand the contribution of motor information to speech perception, but let's not forget that there is more to the brain and speech processing than the motor system. For example, there is a long history of research on lexical effects in speech perception. The Ganong (1980) effect is one: the category boundary will shift toward the lexical item in a speech continuum like gift-kift, where one end of the continuum is a word. Another example comes from the phoneme restoration effect (Warren, 1970). If a speech segment is deleted in the middle of a word, you can easily hear the gap. However, if that gap is replaced by a noise, the missing segment can be heard quite clearly in some cases. This effect is enhanced by lexical information (Samuel, 1981): phonemic restoration is more robust in words than nonwords and in longer words (more lexical predictability) than shorter words.
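
To make the Ganong shift concrete, here is a small Python sketch with invented identification rates (not Ganong's data) showing how the boundary shift is typically quantified: fit a psychometric function to the proportion of /g/ responses along each VOT continuum and compare the 50% crossover points.

```python
# Illustrative identification rates only; the VOT steps and proportions are hypothetical.
import numpy as np
from scipy.optimize import curve_fit

def logistic(vot, boundary, slope):
    # Proportion of /g/ responses falls off as VOT increases past the boundary.
    return 1.0 / (1.0 + np.exp(slope * (vot - boundary)))

vot = np.array([5.0, 15.0, 25.0, 35.0, 45.0, 55.0])                # ms
p_g = {
    "gift-kift": np.array([0.98, 0.95, 0.80, 0.55, 0.20, 0.05]),   # "gift" end is a word
    "giss-kiss": np.array([0.97, 0.90, 0.60, 0.30, 0.10, 0.03]),   # "kiss" end is a word
}
for label, responses in p_g.items():
    (boundary, slope), _ = curve_fit(logistic, vot, responses, p0=[30.0, 0.2])
    print(f"{label}: category boundary approx. {boundary:.1f} ms VOT")
# The boundary sits at a longer VOT for gift-kift: the lexical item "pulls"
# ambiguous tokens toward the /g/ (word) end of the continuum.
```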

These are interesting effects that are typically interpreted as evidence for top-down modulation of lower-level auditory perception. (Note that motor effects can be explained in exactly the same way, top-down modulation; there is no need to resurrect the Motor Theory of Speech Perception.) Over the last decade or so, there has been increasing interest in identifying the neural basis of these effects. One study by Myers & Blumstein investigated the Ganong effect and another by Shahin, Bishop, & Miller investigated phonemic restoration. I love this line of work, but I'm not sure we have nailed down the best approach yet.

Myers & Blumstein used voice onset time continua involving gift-kift and giss-kiss. They took advantage of the fact that the category boundaries for these two VOT-matched continua differ because they have lexical items at opposite ends. This allowed them to compare the BOLD response in an fMRI study to the same VOT-matched stimulus when it fell at the boundary in one continuum but at a non-boundary position in the other. They reported more activity for boundary items than non-boundary items in (i) bilateral STG, (ii) L cingulate, (iii) L precentral gyrus, (iv) L mid frontal gyrus, & (v) L precuneus. They interpreted the STG activation as evidence that lexical information influences early perceptual processes, and the activations in frontal/midline regions as reflections of higher-order executive processes.

This conclusion seems reasonable, but I'm not sure I buy the logic that gets us there. I suppose the logic of the particular comparisons is that for ambiguous stimuli (those at the boundary) the lexical effect will be most prominent and therefore show up in the BOLD response for boundary stimuli relative to non-boundary stimuli. But one might also reason that the strongest lexical effect should be found at certain non-boundary items, namely those that are normally at the boundary but now are not at the boundary because of the lexical pull. I.e., a stimulus that used to be ambiguous is now non-ambiguous because of all the work lexical information has done to affect perception. Another possible explanation of their findings is that boundary items are more difficult to categorize and so require more executive resources (frontal/midline activations), and these executive systems in turn modulate auditory areas, e.g., by increasing attentional gain. In short, I don't think the conclusions are necessarily wrong, but there are some questions remaining.

Shahin et al. (2009) used a pretty slick design to assess phonemic restoration effects in physically very similar stimuli in an fMRI study. Following Samuel, they presented speech that either contained a gap (with noise filling the gap) or did not contain a gap (with noise superimposed over the speech segment). Subjects were asked to decide whether the stimulus was intact or contained a gap. One complication of studying phonemic restoration is that speech that contains a gap is physically different from speech that does not, so it is unclear whether any observed effects result from the illusion or the physical gap. To get around this, Shahin et al. manipulated the duration of the noise burst in the stimuli such that all stimuli were right at the threshold boundary for hearing the illusion or not. This resulted in a set of highly overlapping stimuli in terms of their physical properties but a wobbly perception. They then used information about how the stimuli were actually perceived to probe the brain response.
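
Schematically, the logic amounts to binning trials by the crossing of physical stimulus and perceptual report. The hypothetical trial log below is just to illustrate the sorting step; the comparisons listed next are then built from these bins.

```python
# Hypothetical trial log, for illustration only.
import pandas as pd

trials = pd.DataFrame({
    "stimulus": ["gap", "gap", "intact", "gap", "intact", "intact"],
    "report":   ["intact", "gap", "intact", "intact", "intact", "gap"],
})
print(trials.groupby(["stimulus", "report"]).size())
# gap / intact    -> illusion trials (phonemic restoration)
# gap / gap       -> illusion failed
# intact / intact -> veridically intact
```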

The primary comparisons were

(1) items that elicited an illusion (items with gaps that were perceived as intact) minus items that were intact and perceived as intact -- so both stimuli were perceived as intact but one was illusory. This contrast was assumed to identify areas involved in phonemic repair.

(2) items that elicited an illusion minus items that failed to elicit an illusion (items with gaps that were perceived as items with gaps). This contrast was assumed to identify areas that correlated with the actual perception of the illusion.

Comparison #1 resulted in activation in Broca's area (~BA44), the anterior insula bilaterally, and the left pre-SMA.
Comparison #2 resulted in activation in left angular gyrus/STS, right STS, precuneus, and bilateral superior frontal sulcus.

Both word and nonword stimuli were used and these effects were evaluated in ROI analyses. The left AG/STS showed an interaction between lexical status and perceptual condition, which the authors suggest is reflective of the use of a lexical template for filling in missing information. Broca's area and the insulae also showed an interaction and further seemed to respond most robustly to illusion-failure trials within the word condition (reflecting extra work trying, but failing, to repair?).

So unlike the Myers & Blumstein study, Shahin et al. do not report extensive activity in the bilateral STG (the STS activity is very posterior) but instead find "repair" activity in frontal areas and lexical effects ("template matching") in posterior STS/AG.

One complication with the AG/STS activations is that these are all sub-baseline effects (signal intensity < 0), so the differences are degrees of deactivation. One could appeal to the "default network" in explaining these patterns, but the authors argue against such an account. At the very least, the negative activation complicates the picture.

The real question here is what these subtractions reveal. In principle, I like the idea of correlating responses with perceptual experience. But at the same time, conscious perceptual experience is a fairly high-level phenomenon, whereas many of the processes we are interested in, those down in the trenches of the processing stream that ultimately lead to perception, may be unconscious and may share computations between stimuli that are ultimately perceived one way versus another. So in the end, it is hard to know what is actually being detected and, more importantly, what is not being detected.

In general, I think this is an important line of investigation and I'd like to see folks give it more attention. Who knows, it might even lead to a competitor to the sensorimotor models of speech perception: the sensory-lexical model of speech perception.

References

Ganong, W. F. (1980). Phonetic categorization in auditory word perception. Journal of Experimental Psychology: Human Perception and Performance, 6, 110-125.

Shahin, A., Bishop, C., & Miller, L. (2009). Neural mechanisms for illusory filling-in of degraded speech NeuroImage, 44 (3), 1133-1143 DOI: 10.1016/j.neuroimage.2008.09.045

Myers EB, & Blumstein SE (2008). The neural bases of the lexical effect: an fMRI investigation. Cerebral cortex (New York, N.Y. : 1991), 18 (2), 278-88 PMID: 17504782

Samuel, A.G. (1981). Phonemic restoration: Insights from a new methodology. JEP: General, 110, 474-94.

Warren, R.M. (1970). Perceptual restoration of missing speech sounds. Science, 167, 392-393.

Monday, February 8, 2010

What's an "Opinion" in journal reviews?


Some journals have subcategories of reviews that include labels like "opinion" or "perspective". For example, our 2007 paper in Nature Reviews Neuroscience (Hickok, G., & Poeppel, D. (2007). The cortical organization of speech processing Nature Reviews Neuroscience, 8 (5), 393-402 DOI: 10.1038/nrn2113) appeared in the "Perspectives" section, not the "Reviews" section, and was further branded with the dreaded label, OPINION. I find it amusing how some folks use this in their citation of our work: "Hickok, G., & Poeppel, D. (2007) Opinion - The cortical organization...." Is this because they think it is part of the title? Or an attempt to cast doubt on the ideas expressed?

More to the point, what IS an opinion in a review article? Or even more to the point, what ISN'T an opinion? Unless a review article limits itself to a list of observations of the form, "BOLD signal increases at x,y,z coordinate during the presentation of x compared to the presentation of y" or "the time it took subjects to push the button corresponding to the '/ba/' response button was longer with the TMS coil ON compared to when it was off", it is an "opinion". Put differently, unless a review is just a recapitulation of the Results sections of a set of papers, the review is the opinion (interpretation) of the authors. To the extent that an interpretation represents a theoretical explanation of the observations and therefore hypotheses that can be tested, etc., "opinions" are what we should be striving for in scientific inquiry. So why single out some review articles as being "Opinion" while others qualify as "Reviews"?

The answer is that they don't really mean "opinion" because every review, indeed every discussion section, is opinion. What they really mean is "controversial" or "non-conventional" -- ideas that shake things up a bit. I think these kinds of reviews are the most interesting and more likely to have an influence on subsequent research.

So although I think labeling some reviews as "Opinion" is a silly, even unscientific, thing to do, as long as they are doing it, I would take it as a compliment.

But that's just my opinion.

Wednesday, February 3, 2010

Brodmann's Map -- 101 years old

In celebration of the centenary of the publication of Korbinian Brodmann's famous map, Karl Zilles & Katrin Amunts have just published a great little piece on its history and current influence (too bad Nature Reviews Neuroscience couldn't have brought it to press in 2009). The paper highlights some interesting tidbits, like the influence of evolutionary theory on Brodmann's work, how Brodmann's map relates to those that followed, how it lost favor and how it was given new life with the advent of functional imaging. The paper even features an interview with Korbinian himself (fictitious, of course).

Beyond the interesting historical perspective, the article underscores the pitfalls associated with over-interpreting Brodmann areas in functional imaging studies, but also emphasizes the importance of anatomy in developing models of the organization of the cerebral cortex.


Zilles K, & Amunts K (2010). Centenary of Brodmann's map - conception and fate. Nature reviews. Neuroscience, 11 (2), 139-45 PMID: 20046193

Thursday, January 28, 2010

Tonotopic organization of human auditory cortex

Former Talking Brains West grad student, Colin Humphries, in collaboration with Einat Liebenthal and Jeff Binder, has recently published the best study yet of the tonotopic organization of human auditory cortex. They found evidence of frequency sensitive gradients, but oriented differently than previous work has suggested. Definitely worth a look.

Humphries, C., Liebenthal, E., & Binder, J. (2010). Tonotopic organization of human auditory cortex NeuroImage DOI: 10.1016/j.neuroimage.2010.01.046

Tuesday, January 26, 2010

Intelligible speech and hierarchical organization of auditory cortex

It has been suggested that auditory cortex is hierarchically organized with the highest levels of this hierarchy, for speech processing anyway, located in left anterior temporal cortex (Rauschecker & Scott, 2009; Scott et al., 2000). Evidence for this view comes from PET and fMRI studies which contrast intelligible speech with unintelligible speech and find a prominent focus of activity in the left anterior temporal lobe (Scott et al., 2000). Intelligible speech (typically sentences) has included clear speech and noise vocoded variants which are acoustically different but both intelligible, whereas unintelligible speech has included spectrally rotated versions of these stimuli. The idea is that regions that respond to the intelligible conditions are exhibiting acoustic invariance, i.e., responding to the higher-order categorical information (phonemes, words) and therefore reflect high levels in the auditory hierarchy.

However, the anterior focus of activation contradicts lesion evidence, which shows that damage to posterior temporal lobe regions is most predictive of auditory comprehension deficits in aphasia. Consequently, we have argued that the anterior temporal lobe activity in these studies is more a reflection of the fact that subjects are comprehending sentences -- which are known to activate anterior temporal regions more than words alone -- than of the intelligibility of speech sounds and/or words (Hickok & Poeppel, 2004, 2007). Therefore, our claim has been that the top of the auditory hierarchy for speech (regions involved in phonemic level processes) is more posterior.

To assess this hypothesis we fully replicated previous intelligibility studies using two intelligible conditions, clear sentences and noise vocoded sentences, and two unintelligible conditions, rotated versions of these. But instead of using standard univariate methods to examine the neural response, we used multivariate pattern analysis (MVPA) to assess regional sensitivity to acoustic variation within and across intelligibility manipulations.

We did perform the usual general linear model subtractions: intelligible [(clear + noise vocoded) - (rotated + rotated noise vocoded)] and found robust activity in the left anterior superior temporal sulcus (STS), but also in the left posterior STS, and right anterior and posterior STS. This finding shows that intelligible speech activity is not restricted to anterior areas, or even the left hemisphere. A broader bilateral network is involved.



Next we examined the pattern of response in various activated regions using MVPA. MVPA looks at the pattern of activity within a region rather than the pooled amplitude of the region as a whole. If different patterns of activity can be reliably demonstrated in a region, this is an indication that the manipulated features (e.g., acoustic variation in our case) are being coded or processed differently within the region.
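
For readers unfamiliar with the approach, here is a minimal scikit-learn sketch with synthetic data (not ours) of the basic move: two conditions with matched mean amplitude in an ROI but different spatial patterns can still be told apart by a cross-validated linear classifier trained on the multivoxel patterns.

```python
# Synthetic illustration of within-ROI pattern classification.
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_trials, n_voxels = 40, 120                       # trials per condition, voxels in the ROI

# Two conditions with the same mean amplitude but different spatial patterns.
pattern_a = rng.normal(size=n_voxels)
pattern_b = rng.permutation(pattern_a)             # same values, shuffled across voxels
X = np.vstack([
    pattern_a + rng.normal(size=(n_trials, n_voxels)),   # noisy trials, condition A
    pattern_b + rng.normal(size=(n_trials, n_voxels)),   # noisy trials, condition B
])
y = np.repeat([0, 1], n_trials)

# Mean ROI amplitude is matched...
print(X[y == 0].mean(), X[y == 1].mean())
# ...but the multivoxel pattern still separates the conditions.
acc = cross_val_score(LinearSVC(), X, y, cv=5).mean()
print(f"cross-validated accuracy: {acc:.2f}")      # well above the 0.5 chance level
```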

The first thing we looked at was whether the pattern of activity in and immediately surrounding Heschl's gyrus was sensitive to intelligibility and/or acoustic variation. This is actually an important prerequisite for claiming acoustic invariance, and therefore higher-order processing, in downstream auditory areas: If you want to claim that invariance to acoustic features downstream reflects higher levels of processing in the cortical hierarchy, you need to show that earlier auditory areas are sensitive to these same acoustic features. So we defined early auditory cortex independently using a localizer scan, AM noise modulated at 8Hz relative to scanner noise. The figure below shows the location of this ROI (roughly that is, as this is a group image and for all MVPA analyses ROIs are defined in individual subjects) and the average BOLD amplitude to the various speech conditions. Notice that we see similar levels of activity for all conditions, especially clear speech and rotated speech which appear to yield identical responses in Heschl's gyrus. This seems to provide evidence that rotated speech is indeed a good acoustic control for speech.



However, using MVPA, we found that the pattern of activity in Heschl's gyrus (HG) could easily distinguish clear speech from rotated speech (it is responding to these conditions differently). In fact, HG could distinguish each condition from every other, including the within-intelligibility contrasts such as clear vs. noise vocoded (both intelligible) and rotated vs. rotated noise vocoded (both unintelligible). It appears that HG is sensitive to the acoustic variation between our conditions. The figure below shows classification accuracy for the various MVPA contrasts in left and right HG. The dark black line indicates chance performance (50%), whereas the thinner line indicates the upper bound of the 95% confidence interval determined via a bootstrapping method.



Again this highlights the fact that standard GLM analyses obscure a lot of information that is contained in those areas that appear to be insensitive to the manipulations we impose.

So what about the STS? Here we defined ROIs in each subject using the clear minus rotated contrast, i.e., the two conditions that showed no difference in average amplitude in HG. ROIs were anatomically categorized in each subject as "anterior" (anterior to HG), "middle" (lateral to HG), or "posterior" (posterior to HG). In a majority of subjects, we found peaks in the anterior and posterior STS in the left hemisphere (but not in the mid STS), and peaks in the anterior, middle, and posterior STS in the right hemisphere. ROIs were defined using half of our data and MVPA was performed on the other half; this ensured complete statistical independence.

Here are the classification accuracy graphs for each of the ROIs. The left two bars in each graph show across-intelligibility contrasts (clear vs. rotated & noise vocoded vs. rotated NV). These comparisons should classify if the area is sensitive to the difference in intelligibility. The right two bars show within-intelligibility contrasts (clear vs. NV, both intell; rot vs. rotNV, both unintell). These comparisons should NOT classify if the ROI is acoustically invariant.



Looking first at the left hemisphere ROIs, notice that both anterior and posterior regions classify the across intelligibility contrasts (as expected). But the anterior ROI also classifies clear vs. noise vocoded, two intelligible conditions. The posterior ROI does not classify either of the within intelligibility contrasts. This suggests that the posterior ROI is the more acoustically invariant region.

The right hemisphere shows a different pattern in this analysis. The right anterior ROI shows a pattern that is acoustically invariant whereas the mid and posterior ROIs classify everything, every which way, more like HG.

If you look at the overall pattern within the graphs across areas, you'll notice a problem with the above characterization of the data. It categorizes a contrast as classifying or not and doesn't take into account the magnitude of the effects. For example, notice that as one moves from aSTS to mSTS in the right hemisphere, classification accuracy for the across-intelligibility contrasts rises (as it does in the left hemisphere), and that in the right aSTS clear vs. NV just misses significance, whereas in the mSTS clear vs. NV barely passes significance. We may be dealing with thresholding effects. This suggests that we need a better way of characterizing acoustic invariance that uses all of the data.

So what we did is calculate an "acoustic invariance index" which basically measures the magnitude of the intelligibility effect (left two bars compared with right two bars). This difference should be large if an area is coding features relevant to intelligibility. This measure was then corrected by the "acoustic effect" (the sum of the absolute difference in classification accuracy within intelligibility conditions). When you do this, here is what you get (acoustic invariance = positive values, range -1 to 1):
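
The exact formula is in the paper (Okada et al., 2010); the snippet below is only my reading of the verbal description above, offered to make the structure of the index explicit. The normalization shown is one plausible way to keep the value in the stated -1 to 1 range, and the example accuracies are hypothetical.

```python
# Plausible reconstruction of the index from the verbal description; not the published formula.
import numpy as np

def acoustic_invariance_index(across, within, chance=0.5):
    """across: accuracies for the two across-intelligibility contrasts
       (clear vs. rot, NV vs. rotNV); within: the two within-intelligibility
       contrasts (clear vs. NV, rot vs. rotNV)."""
    intelligibility_effect = np.mean(across) - np.mean(within)      # "left two bars vs. right two bars"
    acoustic_effect = np.sum(np.abs(np.asarray(within) - chance))   # sensitivity within an intelligibility level
    # One plausible normalization that keeps the index between -1 and 1.
    return (intelligibility_effect - acoustic_effect) / (abs(intelligibility_effect) + acoustic_effect + 1e-12)

print(acoustic_invariance_index(across=[0.75, 0.72], within=[0.52, 0.55]))  # hypothetical ROI, approx. 0.48
```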



HG is the most sensitive to acoustic variation across conditions and more posterior areas (pSTS in the left, mSTS in the right) are the least sensitive to acoustic variation. The aSTS ROIs fall in between these extremes. So left pSTS and right mSTS, as we've defined them anatomically, appear to be functionally homologous and represent the top of the auditory hierarchy for phoneme-level processing. I don't know what is going on in right pSTS.

What features are these areas sensitive to? My guess is that HG is sensitive to any number of acoustic features within the signals, aSTS is sensitive to suprasegmental prosodic features, and pSTS is sensitive to phoneme level features. Arguments for these ideas are provided in the manuscript.

References

Okada, K., Rong, F., Venezia, J., Matchin, W., Hsieh, I., Saberi, K., Serences, J., & Hickok, G. (2010). Hierarchical Organization of Human Auditory Cortex: Evidence from Acoustic Invariance in the Response to Intelligible Speech Cerebral Cortex DOI: 10.1093/cercor/bhp318

Hickok, G., & Poeppel, D. (2004). Dorsal and ventral streams: A framework for understanding aspects of the functional anatomy of language. Cognition, 92, 67-99.

Hickok, G., & Poeppel, D. (2007). The cortical organization of speech processing. Nat Rev Neurosci, 8(5), 393-402.

Rauschecker, J. P., & Scott, S. K. (2009). Maps and streams in the auditory cortex: nonhuman primates illuminate human speech processing. Nat Neurosci, 12(6), 718-724.

Scott, S. K., Blank, C. C., Rosen, S., & Wise, R. J. S. (2000). Identification of a pathway for intelligible speech in the left temporal lobe. Brain, 123, 2400-2406.

Thursday, January 21, 2010

Disentangling syntax and intelligibility -- Or how to disprove two theories with one experiment

I both love and hate a recent paper by Angela Friederici, Sonja Kotz, Sophie Scott, & Jonas Obleser titled Disentangling syntax and intelligibility in auditory language comprehension. The paper is in the "Early View" section of Human Brain Mapping.

Here's why I love it. There are a number of claims in the literature on the neuroscience of language that I disagree with. One is Sophie Scott's claim that speech recognition is a left hemisphere function that primarily involves anterior temporal regions. Another is Angela Friederici's claim that a portion of Broca's area, BA44, is critical for "hierarchical structure processing". In the study reported in this new paper, Friederici and Scott have teamed up and proven both of these claims to be incorrect. This I like.

What I hate about the paper is that the authors don't seem to recognize that their new data provide strong evidence against their previous claims, and in fact argue that it supports their view(s).

So what did they do? The experiment is a nice combination of the intelligibility studies that Scott has published and the syntactic processing studies that come out of Friederici's lab. It was a 2x2 design: grammatical sentences versus ungrammatical sentences x intelligible versus unintelligible (spectrally rotated) sentences.

What did they find? The intelligible minus unintelligible contrast showed bilateral activation up and down the length of the STG/STS, i.e., not just in the left hemisphere and not just anterior to Heschl's gyrus. This contradicts previous studies from Scott's group, particularly with respect to the right hemisphere activation, as the current paper correctly pointed out:

...the right-hemispheric activation in response to increasingly intelligible speech deviates from the original papers on intelligibility [Narain et al., 2003; Scott et al., 2000]. (p. 6)


In short, the primary bit of data that has been driving claims for a left anterior pathway for intelligible speech has been shown to be inaccurate. This is not terribly surprising, as those previous studies were severely underpowered.

Conclusion #1: the "pathway for intelligible speech" is bilateral and involves both anterior and more posterior portions of the STS/STG.

What about Broca's area and hierarchical structure building? In fairness, most of the paper was about the STG/STS and not about Broca's area, but the role of Broca's area was addressed and of course it is perfectly fair to use data from this study to address a hypothesis proposed by Friederici in other papers. If Broca's area is involved in hierarchical structure building, then it should activate during the comprehension of sentences, which surely are hierarchically structured. Thus, the intelligible (structured) minus unintelligible (unstructured) contrast should result in activation of Broca's area. Yet it did not. The contrast between intelligible and unintelligible sentences resulted only in activation in the superior temporal lobes.

Conclusion #2: Hierarchical structure building can be achieved without Broca's area involvement.

So in light of these findings, how does one maintain the view that intelligible speech primarily involves the left hemisphere and that syntactic (hierarchical) processing involves Broca's area? It all hinges on the response to those pesky ungrammatical sentences.

Here's the assumption on which their argument relies: syntactic processing is really only revealed during the processing of ungrammatical sentences. They don't state it in these terms, but this is what you have to assume for their arguments to work. Right off the bat we have a problem with this assumption. When you listen to an ungrammatical sentence, not only does this mess up syntactic processing, but it also increases the load on semantic integration processes, and who knows what other meta-cognitive processes are invoked by hearing a sentence like "The pizza was in the eaten", which is an example of the kind of violation they used. In fact, one might even argue that processing an ungrammatical sentence causes the syntactic processing mechanism to shut down and cognitive interpretation strategies to crank up instead. Thus, rather than highlighting syntax, such a manipulation may highlight non-syntactic comprehension strategies!

So what happens when you listen to ungrammatical sentences and spectrally rotated ungrammatical sentences?

Ungrammatical sentences minus grammatical sentences (intelligible only) resulted in activation in the left and right superior temporal lobes, Broca's area (left BA 44), and the left thalamus. So the "syntactic" effect is bilateral in the superior temporal lobe, but at least we now have Broca's area active.

The authors then took the seven ROIs defined by the two main contrasts (the intelligibility contrast and the grammaticality contrast), extracted percent signal change around the peaks, and performed subsequent ANOVAs to assess interactions. These interactions are what really drive their argument. However, we now have another problem: the data that defined the ROIs are not independent of the data that were subsequently analyzed with the ANOVAs. We therefore can't be sure the reported effects are valid. Nonetheless, let's pretend they are and see if the conclusions make sense.
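
To make the non-independence worry concrete, here is a minimal sketch of one standard remedy (in Python with NumPy, using made-up data and hypothetical variable names): define the ROI on one half of the data and run the 2x2 analysis on the other half, so that selection and test never see the same observations.

    import numpy as np

    rng = np.random.default_rng(0)

    # Made-up data: percent signal change per subject, voxel, and condition,
    # split into two independent halves of the runs.
    # Dimensions: [subject, voxel, intelligibility (0/1), grammaticality (0/1)]
    n_subj, n_vox = 20, 500
    half1 = rng.normal(0, 1, (n_subj, n_vox, 2, 2))
    half2 = rng.normal(0, 1, (n_subj, n_vox, 2, 2))

    def intelligibility_effect(data):
        # intelligible minus unintelligible, averaged over grammaticality and subjects
        return (data[:, :, 1, :].mean(-1) - data[:, :, 0, :].mean(-1)).mean(0)

    # Select the ROI on half 1 only (top 20 voxels for the intelligibility contrast)...
    roi = np.argsort(intelligibility_effect(half1))[-20:]

    # ...and compute the 2x2 cell means and interaction on half 2 only.
    cell_means = half2[:, roi].mean(axis=1)                      # [subject, intell, gramm]
    interaction = (cell_means[:, 1, 1] - cell_means[:, 1, 0]) \
                - (cell_means[:, 0, 1] - cell_means[:, 0, 0])    # per-subject interaction score
    print("mean syntax-by-intelligibility interaction:", interaction.mean())

Selecting voxels and then testing effects in those same voxels on the same data inflates exactly the effects that were used to pick the voxels in the first place, which is why the split matters.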

Here is a graph of the interactions:



The claim here is that "syntax" (i.e., greater response to ungrammatical) and intelligibility (i.e., greater response to intelligible) significantly interacted only in the left hemisphere ROIs, and indeed in all of them, including BA 44 and the thalamus. Therefore these regions represent the critical network, according to Friederici et al., because they are responding to the syntactic features in intelligible speech and not merely to acoustic differences, which are present in the unintelligible speech as well. Something is very wrong with this logic, even beyond the possibly invalid assumption and analysis methods noted above.

Consider the response pattern in BA 44: zero response to normal, syntactically structured sentences (which presumably require some degree of syntactic processing), significant activation to intelligible ungrammatical sentences, significant (or so it seems) activation to UNINTELLIGIBLE versions of grammatical sentences, and no activation to unintelligible versions of ungrammatical sentences. What possible syntactic computation could be invoked BOTH by a grammatical violation and by unintelligible noise, but not by grammatical sentences? And this pattern is considered part of the intelligible speech/syntactic processing system, whereas the right anterior STS, which shows a very robust intelligibility effect and no obvious effect of the violation, is not. I would suggest instead that because the right STS is actually responding to sentences, and not just to broken sentences or spectrotemporal noise patterns, it is the more likely candidate for involvement in sentence processing.
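
To see why the interaction by itself settles very little, consider a toy calculation with hypothetical percent-signal-change values chosen to mimic the BA 44 pattern described above: the interaction term can be sizable even though the cell that matters most for syntax, intelligible grammatical sentences, shows no response at all.

    import numpy as np

    # Hypothetical cell means (percent signal change), loosely mimicking the
    # BA 44 pattern described above. Rows: intelligibility; columns: grammaticality.
    #                      grammatical  ungrammatical
    cells = np.array([[0.20,         0.00],    # unintelligible (rotated)
                      [0.00,         0.25]])   # intelligible

    # The 2x2 interaction is a difference of differences:
    violation_effect_intell   = cells[1, 1] - cells[1, 0]   # 0.25
    violation_effect_unintell = cells[0, 1] - cells[0, 0]   # -0.20
    interaction = violation_effect_intell - violation_effect_unintell
    print(interaction)   # 0.45 -- a robust "interaction" despite zero response
                         # to intelligible, grammatical sentences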

In the end, Friederici et al.'s entire argument rests on (i) a possibly invalid assumption about their "syntactic" manipulation, (ii) a possibly contaminated statistical analysis, and (iii) a logically questionable definition of what counts as a region involved in the processing of these language stimuli.

The basic findings are extremely important, though, because they confirm that speech recognition, and now the "pathway for intelligible speech", is bilateral, and that Broca's area is silent during normal sentence comprehension and therefore is not involved in basic syntactic/hierarchical structure building.

References


Friederici AD, Kotz SA, Scott SK, & Obleser J (2009). Disentangling syntax and intelligibility in auditory language comprehension. Human Brain Mapping. PMID: 19718654

Narain, C., Scott, S. K., Wise, R. J., Rosen, S., Leff, A., Iversen, S. D., & Matthews, P. M. (2003). Defining a left-lateralized response specific to intelligible speech using fMRI. Cereb Cortex, 13(12), 1362-1368.

Scott, S. K., Blank, C. C., Rosen, S., & Wise, R. J. S. (2000). Identification of a pathway for intelligible speech in the left temporal lobe. Brain, 123, 2400-2406.

Scott, S. K., & Wise, R. J. (2004). The functional neuroanatomy of prelexical processing in speech perception. Cognition, 92(1-2), 13-45.

Thursday, January 14, 2010

At the Frontiers of Neuro-Aphasiology: Two papers by Julius Fridriksson

The following is a guest post by Whitney Anne Postman-Caucheteux.


If there were a sub-field of Aphasiology devoted to imaging of the brains of people with aphasia, let’s call it “Neuro-Aphasiology”, then Dr. Julius Fridriksson would be one of its most distinguished pioneers. As a long-time admirer of his work, I can think of few other aphasia researchers who have gone beyond simply talking about issues such as age, task difficulty, and perfusion in the neuroimaging of language in people with aphasia to actually conducting and publishing the foundational research (see Fridriksson et al 2006, 2005, 2002, among others).

With two recently published aphasia fMRI papers, Dr. Fridriksson and his team at the University of South Carolina have done it again, by combining advanced fMRI techniques for acquiring overt speech responses with sophisticated psycholinguistic analyses of word production in aphasia. I propose that these two papers should be read as a pair, since each provides complementary investigations of the contributions of perilesional and contralesional regions to language production in chronic aphasia:

“F1”: Fridriksson, J., Baker, J.M. & Moser, D. (2009). Cortical mapping of naming errors in aphasia. Human Brain Mapping, 30, 2487-2498.

“F2”: Fridriksson, J., Bonilha, L., Baker, J.M., Moser, D., & Rorden, C. (in press). Activity in preserved left hemisphere regions predicts anomia severity in aphasia. Cerebral Cortex.

Both papers (hence, “F1” and “F2”) describe Fridriksson et al’s overt picture-naming experiments with chronic stroke patients with aphasia using fMRI. F1 is perhaps the more revolutionary of the two, in being the first to link certain patterns of neural activation in such patients with specific error types that have long been the subject of psycholinguistic investigations (e.g., Schwartz et al, 2006). I will review F1 and F2 in turn before offering suggestions on how they complement each other in providing clues to different pieces of the puzzle of language production in post-stroke aphasia.

Part I on F1: Fridriksson, J., Baker, J.M. & Moser, D. (2009). Cortical mapping of naming errors in aphasia. Human Brain Mapping, 30, 2487-2498.

In F1, Fridriksson et al employed a sparse sampling technique to acquire overt naming responses to object pictures from 11 stroke patients with various types of aphasia and a range of anomia severity, all in chronic stages. Their goal was to identify common areas of activation across the entire cohort associated with 1) accurate picture-naming, 2) phonemic errors, and 3) semantic errors. This goal is crucial to understanding the neural substrate of disordered language production post-stroke, and it represents a valuable use of novel techniques for acquiring overt speech with fMRI. Dr. Bruce Crosson and colleagues have elegantly outlined their recommendations for how best to acquire, analyze, and interpret fMRI data from language production tasks with patients with aphasia in Crosson et al (2007). In addition to the familiar complications of having stroke patients with aphasia participate in fMRI studies, acquisition of overt speech responses from such patients during scanning can be confounded by related motor-speech disorders such as apraxia and by the possibility of extremely long response times for some patients.

Prior fMRI research using silent production could not distinguish between activation patterns associated with accurate and inaccurate responses. Distinctive patterns are to be expected, given results from studies linking superior language production performance (measured outside the scanner in fMRI studies, and during scanning in PET studies) with predominantly perilesional activation, and inferior performance with increased contralesional involvement (see references in F1 and F2). Consequently, studies like F1 are needed to discover how neural patterns may differ for accurate and inaccurate naming. Such discoveries can clarify effective vs. ineffective ways in which neural systems respond to damage, and subsequently, how these ways can be enhanced or suppressed with treatment.

Focusing on areas of activation common to the cohort, Fridriksson et al masked out voxels that were lesioned in any of the 11 patients, a mask extending over the greater portion of the left hemisphere. The patients achieved a wide range of accuracy and committed semantic errors, phonemic errors, unrelated errors, neologisms, or omissions (see Table II in F1, reproduced and modified below).



The authors correlated the patients’ correct responses and semantic and phonemic errors with increases in BOLD activation in neural regions outside of the aforementioned mask, yielding the following intriguing results:

Result #1: Correct names correlated positively with increases in BOLD response in right inferior frontal gyrus.

Result #2: Phonemic errors correlated positively with increases in BOLD response in left precuneus (BA19), cuneus (BA7) and posterior inferior temporal gyrus (BA37).

Result #3: Semantic errors correlated positively with increases in BOLD response in right cuneus (BA 18), middle occipital gyrus (BA 18/19), and posterior inferior temporal gyrus (BA37).
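
For readers who want the gist of the masking-and-correlation approach just described in code, here is a simplified sketch (Python with NumPy/SciPy and invented arrays; the authors' actual statistics were more sophisticated, as noted in footnote 2): correlate each behavioral measure with BOLD change across patients, restricted to voxels outside the group lesion mask.

    import numpy as np
    from scipy.stats import pearsonr

    rng = np.random.default_rng(1)

    # Invented data: per-patient BOLD increase at each voxel, plus a behavioral score
    n_patients, n_vox = 11, 1000
    bold_change = rng.normal(0, 1, (n_patients, n_vox))
    pct_correct = rng.uniform(0, 100, n_patients)       # e.g., naming accuracy

    # Group lesion mask: True where the voxel is lesioned in at least one patient
    lesioned_in_any = rng.random(n_vox) < 0.4

    # Voxel-wise correlation restricted to voxels outside the mask
    r_map = np.full(n_vox, np.nan)
    p_map = np.full(n_vox, np.nan)
    for v in np.where(~lesioned_in_any)[0]:
        r_map[v], p_map[v] = pearsonr(pct_correct, bold_change[:, v])

    # Voxels where accuracy goes with greater BOLD increase (uncorrected threshold;
    # a real analysis would correct for multiple comparisons)
    print(np.sum((r_map > 0) & (p_map < 0.001)))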

The first result linking naming accuracy to activation in the right IFG is corroborated by some previous research suggesting positive contributions of this region to successful language production (see references in F1). It is also ostensibly in contradiction with the principal results of F2, but more on that issue in Part III. Here I’d like to concentrate on the top graph in Figure 3:



The tight correlation between accuracy and right IFG activation is striking. Even so, it is worth noting that for 4 patients, naming accuracy was coupled with virtually no increase in BOLD amplitude (0% or less than 0.1%), raising the question of whether this result was carried largely by a subset of the cohort. If this were indeed the case, it might help to elucidate which patients can be expected to show substantial right IFG activation linked to accuracy. This issue was raised in Postman-Caucheteux et al (in press), in a discussion of important case studies by Meinzer et al (2006) and Vitali et al (2007).
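
One simple way to probe that concern, sketched here with invented numbers rather than the actual data, is a leave-one-out check: recompute the accuracy-by-activation correlation with each patient dropped in turn and see how much the coefficient moves.

    import numpy as np
    from scipy.stats import pearsonr

    # Invented values loosely shaped like the scenario described above: several
    # patients cluster near zero BOLD change while a few carry most of the spread.
    bold_rifg = np.array([0.00, 0.02, 0.05, 0.08, 0.3, 0.5, 0.7, 0.9, 1.1, 1.3, 1.5])
    accuracy  = np.array([10,   12,   15,   20,   30,  35,  45,  55,  60,  70,  80])

    full_r, _ = pearsonr(bold_rifg, accuracy)
    print(f"full-sample r = {full_r:.2f}")

    # Drop each patient in turn and recompute the correlation
    for i in range(len(accuracy)):
        keep = np.arange(len(accuracy)) != i
        r, _ = pearsonr(bold_rifg[keep], accuracy[keep])
        print(f"without patient {i + 1}: r = {r:.2f}")

A large swing in r when one or two patients are removed would suggest that the group-level effect is not representative of the whole cohort.
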
The novel results coupling phonemic errors with ipsilesional posterior activation, and semantic errors with contralesional posterior activation, should inspire future research directed at replicating and developing them further. The explanations offered by the authors for why these regions should be involved in the production of these errors are plausible and appealing. Especially interesting was their finding that the neural activation patterns linked to each of these error types were essentially additions to the pattern observed for correct naming. That is, both involved the same neural substrate as accuracy, plus activation in the aforementioned posterior areas. This finding agrees with that for incorrect vs. correct naming in Postman-Caucheteux et al (in press), although we found a link with right frontal, not posterior, regions for semantic paraphasias (as well as omissions). However, the patients in our smaller cohort had frontal-insular-parietal damage, with almost no temporal damage. Since the brain region most affected in the F1 cohort was posterior temporal, this comparison raises the possibility that semantic errors may principally involve directly contralesional activation, i.e., activation in right frontal regions in patients with left frontal lesions, and in right posterior regions in patients with left posterior lesions. A worthwhile approach to investigate this possibility would take into account the precise nature of the semantic errors, which brings me to my next point.

More qualitative details (including examples) from all of the error types, and measurement of reaction times as an index of naming difficulty, would have been informative. I would also like to know whether the many unrelated errors produced by P3 were perseverations, which may constitute their own special class of errors. Likewise, more information on the types of semantic errors could have been used to support the authors’ interpretation of right posterior activation as representing less specific semantic representations (p. 2496). Furthermore, research on the evolution of neologisms into phonemic paraphasias (Bose & Buchanan, 2007) implies that comparing possible neural patterns for neologisms with those found for phonemic errors could have been instructive.

The grounds for the authors’ exclusion of other types of errors from their analyses are somewhat unclear, for even though phonemic and semantic errors were the most frequent types, the other error types were not infrequently produced by certain patients. With regard to omissions, even though their interpretation is indeed problematic, they are routinely tracked as errors, and they can be predicted by specific psycholinguistic factors such as semantic competition (Schnur et al, 2006). Since the authors did not include an analysis of factors that could have contributed to each type of error (e.g., percent name agreement, age of acquisition, target word length), it is unknown whether certain stimuli were consistently more likely to induce errors, as was found in Postman-Caucheteux et al (in press). Given that the effects of different psycholinguistic variables on picture-naming have been linked to specific neural areas of activation (Schnur et al 2009, Wilson et al 2009), future studies inspired by F1 should seek to isolate the variables that induce errors, perhaps by manipulating stimuli according to factors of interest and measuring reaction times. This approach may be helpful for interpreting the nature of error-linked activation.

As corroboration of their findings in F1, the authors cite a clever treatment study of word learning using PET by Raboyeau et al (2008). Chronic stroke patients with aphasia were trained to produce names of objects that had been difficult for them prior to therapy. At the same time, healthy participants were trained to produce words in second languages that they had acquired in school with varying degrees of proficiency. Raboyeau et al’s findings of increased right insular and frontal activation with word learning in both groups are interpreted in F1 along these lines:

“[They] concluded that increased activity in the right frontal lobe in aphasia is not merely the consequence of damaged homologues in the left hemisphere but, rather, is a reflection of increased reliance on the right hemisphere to support aphasia recovery” (p. 2496).

Raboyeau et al’s findings may actually be trickier to explain, as they also included more activation in left frontal regions (BA’s 10 and 11) in the patients but not the controls. Additional left hemisphere activation may have been present in some patients but, as with F1, lesioned voxels were excluded from their analyses. Here is how Raboyeau et al state their own conclusions:

“[...] Activations observed in these two right frontal regions do not seem to play a true compensatory function in aphasia (italics mine), and do not represent a mere consequence of left hemispheric lesion, as they existed also in non–brain-damaged subjects [...]” (p.296).

So as I understand their discussion, they did not infer that right insular and frontal activation supported recovery, as suggested in F1. Rather, their results indicated greater effort and cued word retrieval as a result of training, in patients as well as controls. If I have misconstrued Raboyeau et al’s and F1’s conclusions, hopefully someone will help me see the light.

Part II on F2: Fridriksson, J., Bonilha, L., Baker, J.M., Moser, D., & Rorden, C. (in press). Activity in preserved left hemisphere regions predicts anomia severity in aphasia. Cerebral Cortex.

While the fMRI task and methods of acquisition for F2 appear to be almost identical to those in F1, the leading question and analytical techniques were virtually the converse. Instead of focusing on patients’ errors as in F1, here the authors asked which brain regions appear to support accurate overt picture naming. Also, instead of analyzing the cohort of patients as a group and creating a group lesion mask, here the authors examined the patients (N=15) individually and compared each one’s activation map to the average from an equal number of healthy age-matched control participants.

Since I cannot do justice to their advanced methods here, readers are referred to the original paper for details on the complex steps involved in comparing each patient’s activation map, derived from the contrast of correct picture naming vs. abstract (silent) picture viewing, with the controls’ group map. In essence, the degree to which each patient’s activation map deviated from the average control map was correlated with their proportion of correct naming responses. In addition, structural analyses were conducted to investigate whether the intensity of activation associated with correct naming depended on specific areas of damage. A much-simplified code sketch of this kind of patient-versus-control comparison follows the numbered results below. Results were:

1) In the control group, picture-naming was supported by bilateral activation in posterior regions (cuneus and inferior/middle occipital gyrus (BA 18), middle temporal gyrus (BA 37)), but by highly left-lateralized activation in the transverse (BA 42) and superior (BA 22) temporal gyri and, frontally, in the inferior frontal gyrus (BA 45), middle frontal gyrus (BA 10, 11, 47), and anterior cingulate (BA 32).

2) For the 15 participants with aphasia due to left-hemisphere stroke, accurate picture naming was supported by many of the same left-lateralized regions observed in the control group. Most of these were perilesional, namely the medial and middle frontal gyri (BA 10, 11, 47) and inferior occipital gyrus (BA 18). The left anterior cingulate gyrus (BA 32) was also linked to correct naming in patients, but was considered too medial to qualify as perilesional for this cohort.

3) In the patients, intensity of activation in these left-lateralized areas correlated with number of correct names. Here’s the money shot (Figure 3 in F2), showing cortical areas associated with naming task performance (red-yellow scale) along with the lesion overlay map for all 15 patients (blue-green scale):



4) But Fridriksson et al didn’t stop there. Even more dazzling, those patients who did best on the naming task in the scanner tended to show greater activation than the controls in the regions highlighted above (red-yellow), and those who did less well on the naming task showed less activation than the controls in the same areas. Figure 4 in F2 is copied below, showing “the relationship between intensity of activation (x-axis; measured in Z-scores compared with a group of normal control participants) and the number of correct naming attempts (y-axis; out of 80 pictures) during fMRI scanning”:



5) A final intriguing result: Intensity of activation in the patients was inversely correlated with damage to the posterior left IFG (BA 44).
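
As promised above, here is a much-simplified sketch of the kind of patient-versus-control comparison behind results 3) and 4) (Python with NumPy/SciPy and simulated maps; the real pipeline, as noted in footnote 2, is considerably more involved): each patient's activation map is converted to Z-scores against the control group, and a summary of that deviation is then related to naming performance.

    import numpy as np
    from scipy.stats import pearsonr

    rng = np.random.default_rng(2)

    # Simulated activation maps (one value per voxel) for controls and patients
    n_controls, n_patients, n_vox = 15, 15, 2000
    controls = rng.normal(0, 1, (n_controls, n_vox))
    patients = rng.normal(0, 1, (n_patients, n_vox))
    correct_naming = rng.integers(10, 80, n_patients)   # correct responses out of 80 (simulated)

    # Z-score each patient's map against the control group, voxel by voxel
    ctrl_mean = controls.mean(axis=0)
    ctrl_sd = controls.std(axis=0, ddof=1)
    z_maps = (patients - ctrl_mean) / ctrl_sd            # shape: [patient, voxel]

    # Summarize the deviation within a set of task-relevant voxels (here an arbitrary ROI)
    roi = rng.choice(n_vox, size=100, replace=False)
    roi_z = z_maps[:, roi].mean(axis=1)

    # Does greater-than-control activation in the ROI go with better naming?
    r, p = pearsonr(roi_z, correct_naming)
    print(f"r = {r:.2f}, p = {p:.3f}")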

The findings in 2) are corroborated by those in Postman-Caucheteux et al (in press), showing predominantly left perilesional activation for accurate picture-naming in patients with frontal-insular-parietal damage. To my knowledge, the findings in 3) and 4) provide the most precise characterization yet of ipsilesional (including perilesional and non-perilesional) areas of activation for language production in aphasia, and the most direct link yet established between this activation and production performance.

Part III: Sum Total of F1 + F2

F2 contributes to the mounting evidence from the nascent wave of overt-speech fMRI studies for the fundamental importance of restoration or re-integration of perilesional tissue for good language production in people with chronic aphasia due to stroke (see references in Fridriksson et al (in press) and Postman-Caucheteux et al (in press)). So I couldn’t help but wonder: how would the authors relate these findings to their earlier paper (F1), in which they found a positive role for the right IFG in patients’ accurate naming? As I understand them, the results of F1 and F2 raise the following possibilities:

1) Contralesional (right) IFG may be working in tandem with perilesional regions (and perhaps also ipsilesional non-perilesional regions such as anterior cingulate) to achieve accurate naming in some patients. Thus in F1, some patients who showed increasing right IFG involvement with increasing naming success could have also shown substantial perilesional involvement that was not observable due to the group lesion mask. If this were the case, then it would provide evidence for partnership, rather than competition, between frontal areas of both stroke and non-stroke hemispheres.

2) In some patients, contralesional activation may be so negligible in comparison with robust perilesional activation that it only becomes apparent when large portions of the left hemisphere are masked out by group lesion analyses, as in F1. Presumably, some of the patients in F2 could have also shown right IFG involvement in successful naming, but the intensity may have been too minimal relative to ipsilesional areas to be reliably detected.

I’d like to propose that contralesional IFG activation might be helpful for good language production jointly with ipsilesional areas, but only up to a certain, relatively low threshold. When it exceeds this threshold, it might constitute over-activation that is not effective, may be more evident for naming errors (as observed in Postman-Caucheteux et al, in press), and may even interfere with the functioning of ipsilesional areas (Martin et al, 2009).

In the two studies reviewed here, Dr. Fridriksson’s team found contributions of both perilesional and contralesional activation to language production in post-stroke aphasia. A major step forward in disentangling these contributions has been achieved with their identification of areas involved in certain types of naming errors, pointing the way for future fMRI studies to attend to the details of patients’ production performance. Moreover, they have deepened our understanding of the tight link between activation in certain ipsilesional areas and successful overt word production. To continue progressing in the direction charted by Fridriksson et al, it may be helpful for future investigations and discussions to recognize the functional partnership between stroke and non-stroke hemispheres, and to distinguish effective activation from ineffective or maladaptive over-activation of contralesional areas.

Footnotes

1. The row indicating categories of nonfluent and fluent participants was added here. It does not appear in the original Table II in F1, p.2492.
2. The statistical methods employed by Fridriksson et al were much more sophisticated than mere correlation, but they will not be described in depth here.


References

Bose, A., & Buchanan, L. (2007). A cognitive and psycholinguistic investigation of neologisms. Aphasiology, 21, 726-738.

Crosson, B., McGregor, K., Gopinath, K.S., Conway, T.W., Benjamin, M., Chang, Y.L., et al. (2007). Functional MRI of language in aphasia: A review of the literature and the methodological challenges. Neuropsychology Review, 17, 157–177.

Fridriksson, J., Baker, J.M. & Moser, D. (2009). Cortical mapping of naming errors in aphasia. Human Brain Mapping, 30, 2487-2498.

Fridriksson, J., Bonilha, L., Baker, J.M., Moser, D., & Rorden, C. (in press). Activity in preserved left hemisphere regions predicts anomia severity in aphasia. Cerebral Cortex.

Fridriksson, J., Morrow, K. L., Moser, D., & Baylis, G. C. (2006). Age-related variability in cortical activity during language processing. Journal of Speech, Language, and Hearing Research, 49, 690–697.

Fridriksson, J., & Morrow, L. (2005). Cortical activation and language task difficulty in aphasia. Aphasiology, 19, 239–250.

Fridriksson, J., Holland, A.L., Coull, B.M., Plante, E., Trouard, T.P., & Beeson, P. (2002). Aphasia severity: Association with cerebral perfusion and diffusion. Aphasiology, 16, 859-871.

Martin, P.I., Naeser, M.A., Ho, M., Doron, K.W., Kurland, J., Kaplan, J., et al, (2009). Overt naming fMRI pre- and post-TMS: Two nonfluent aphasia patients, with and without improved naming post-TMS. Brain and Language, 111, 20-35.

Meinzer, M., Flaisch, T., Obleser, J., Assadollahi, R., Djundja, D., Barthel, G., et al. (2006). Brain regions essential for improved lexical access in an aged aphasic patient: A case report. BMC Neurology, 17, 6–28.

Postman-Caucheteux, W.A., Birn, R.M., Pursley, R.H., Butman, J.A., Solomon, J.M., Picchioni, D., McArdle, J., & Braun, A.R. (in press). Single-trial fMRI shows contralesional activity linked to overt naming errors in chronic aphasic patients. Journal of Cognitive Neuroscience.

Raboyeau, G., De Boissezon, X., Marie, N., Balduyck, S., Puel, M., Bézy, C., et al. (2008). Right hemisphere activation in recovery from aphasia: Lesion effect or function recruitment? Neurology, 70, 290–298.

Schnur, T. T., Schwartz, M. F., Brecher, A., & Hodgson, C. (2006). Semantic interference during blocked-cyclic naming. Evidence from aphasia. Journal of Memory and Language, 54, 199–227.

Schnur, T.T., Schwartz, M.F., Kimberg, D.Y., Hirshorn, E., Coslett, H.B., & Thompson-Schill, S.L. (2009). Localizing interference during naming: Convergent neuroimaging and neuropsychological evidence for the function of Broca's area. Proceedings of the National Academy of Sciences, 106, 322-327.

Schwartz, M.F., Dell, G.S., Martin, N., Gahl, S., & Sobel, P. (2006). A case-series test of the interactive two-step model of lexical access: Evidence from picture naming. Journal of Memory and Language, 54, 228-264.

Vitali, P., Abutalebi, J., Tettamanti, M., Danna, M., Ansaldo, A.-I., Perani, D., et al. (2007). Training-induced brain remapping in chronic aphasia: A pilot study. Neurorehabilitation and Neural Repair, 21, 152–160.

Wilson, S.M., Isenberg, A.L., & Hickok, G. (2009). Neural correlates of word production stages delineated by parametric modulation of psycholinguistic variables. Human Brain Mapping, 30, 3596-3608.