Wednesday, May 21, 2008

Gesture discrimination in patients with limb apraxia -- evidence for the mirror system?

As hinted at previously, the recent paper by Pazzaglia et al. (2008, J. Neuroscience, 28:3030-41) provides the best evidence I've seen in support of the mirror neuron theory of action understanding. This is a very nice paper, and a brilliant effort to assess the neural basis of gesture discrimination deficits. Unfortunately, there are some complications.

Here's what makes the paper look strong:

Subjects: 41 CVA patients were studied, 33 with left brain damage (LBD) and 8 with right brain damage (RBD). 21 of the LBD subjects were classified as having limb apraxia on assessment.

Stims and task: Subjects viewed video clips of an actor performing transitive or intransitive meaningful gestures. These gestures were performed either correctly or incorrectly. So for example, a correct transitive gesture might show the actor strumming a guitar (with an actual guitar), while the incorrect gesture would show the actor strumming a flute (a semantically related foil), or a broom (an unrelated foil). Subjects made correct/incorrect judgments on each trial.

Results: Lesion analysis revealed some amazing-looking data. These data look so good that I thought we'd have to change the blog title to Talking Mirror Neurons! Check it out:

Panel A shows the lesion distribution of patients with limb apraxia (LA+) on the left, and without limb apraxia (LA-) on the right. Notice that the LA+ patients have lesions centered in the inferior frontal gyrus and inferior parietal lobe.

Panel B shows the subtraction of LA+ vs. LA- which highlights the involvement of posterior frontal areas and inferior parietal areas in LA+ patients. This is consistent with much previous data. So far, so good.

Next they analyzed gesture discrimination performance in a variety of ways.

1. LA+ patients performed significantly worse on gesture discrimination than LA- patients.

2. Significant positive correlations were found between measures of gesture production and gesture discrimination "demonstrating a clear relationship between performing and understanding meaningful gestures" (p. 3034).

3. A cluster analysis was performed on the LA+ patients based on their discrimination performance. This analysis revealed that not all LA+ patients had gesture discrimination deficits, however. In fact, seven of the 21 LA+ patients were classified as having no deficit in understanding gestures. So while there is a correlation between production and understanding, these abilities do dissociate, as much previous work has shown.

Now on to the really amazing data -- the lesion analysis comparing the LA+ patients with vs. without gesture recognition deficits (+GRD vs. -GRD):

Panel A shows the lesion distribution of these two groups, and Panel B shows the subtraction. It couldn't come out cleaner: patients with limb apraxia and gesture discrimination deficits (LA+[GRD+]) have lesions primarily affecting the inferior posterior frontal gyrus, whereas patients with limb apraxia but without gesture discrimination deficits (LA+[GRD-]) have lesions affecting the posterior parietal lobe.

Further, voxel-based lesion-symptom mapping analysis in all 33 LBD patients revealed that performance on the gesture discrimination task correlated with damage to voxels in the inferior frontal gyrus:

Wow! Mirror neurons rule! Is this solid evidence for the role of the posterior IFG in gesture understanding or what!

Maybe "what." Here are two reasons to be suspicious of the findings.

1. The task effectively used a signal detection type paradigm: view a stimulus and decide whether it is signal (correct gesture) or noise (incorrect gesture). The proper way to analyze such data is to calculate d-prime, as this provides an unbiased measure of discriminability. Unfortunately, most of the behavioral analyses, and all of the lesion analyses, used uncorrected error rates: a score of 1 was given for hits and for correct rejections, and a score of 0 was given for false alarms and misses. This scoring produces biases, particularly when there are more "noise" trials than "signal" trials, as there were in this study, which used a 2:1 noise:signal ratio.

Here's a demonstration using an extreme scenario. Assume three subjects cannot discriminate signal from noise at all but have different response biases:

Subject #1: neutral response bias
10 correct trials: neutral response bias = 5/10 correct
20 incorrect trials: neutral response bias = 10/20 correct
Overall score = 15/30 correct = 50% accuracy (a valid result)

Subject #2: 100% "yes" response bias
10 correct trials: 100% "yes" response bias = 10/10 correct
20 incorrect trials: 100% "yes" response bias = 0/20 correct
Overall score = 10/30 correct = 33% correct

Subject #3: 100% "no" response bias
10 correct trials: 100% "no" response bias = 0/10 correct
20 incorrect trials: 100% "no" response bias = 20/20 correct
Overall score = 20/30 correct = 67% correct

So you get different accuracy scores depending on the subject's response bias, independently of his/her actual ability to discriminate signal from noise. Of course these biases may not be THIS extreme in practice, but they will find their way into the data, and taint the results -- this is the reason for the d-prime statistic. Is it possible that the frontal and parietal regions are correlating with different response biases? Some people have argued that frontal cortex is important for response selection...
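To make the scenario above concrete, here is a minimal Python sketch, not anything taken from the paper. It assumes a subject who cannot discriminate at all says "yes" with the same probability on signal and noise trials, so hit rate equals false-alarm rate. The function names and the 0.9/0.1 bias values are my own choices: d-prime is undefined at exactly 0% or 100% "yes" responses without a correction, so the sketch uses near-extreme biases for d-prime and the true extremes only for accuracy.

```python
from statistics import NormalDist

N_SIGNAL = 10  # correct-gesture trials, as in the example above
N_NOISE = 20   # incorrect-gesture trials (2:1 noise:signal ratio)


def accuracy(hit_rate, fa_rate, n_signal=N_SIGNAL, n_noise=N_NOISE):
    """Proportion correct: hits plus correct rejections over all trials."""
    hits = hit_rate * n_signal
    correct_rejections = (1.0 - fa_rate) * n_noise
    return (hits + correct_rejections) / (n_signal + n_noise)


def d_prime(hit_rate, fa_rate):
    """d' = z(hit rate) - z(false-alarm rate); both rates must be strictly between 0 and 1."""
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)


# Hypothetical subjects with zero discrimination ability: the probability of
# answering "yes" is identical on signal and noise trials.
for label, p_yes in [("neutral", 0.5), ("mostly yes", 0.9), ("mostly no", 0.1)]:
    acc = accuracy(p_yes, p_yes)
    dp = d_prime(p_yes, p_yes)
    print(f"{label:10s} accuracy = {acc:.0%}   d' = {dp:.2f}")

# The fully extreme biases reproduce the 33% and 67% accuracy scores above.
print(f"always yes accuracy = {accuracy(1.0, 1.0):.0%}")
print(f"always no  accuracy = {accuracy(0.0, 0.0):.0%}")
```

Accuracy swings from roughly a third to roughly two thirds depending on bias alone, while d-prime stays at zero for every subject, which is the whole point of using it.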

Upshot: the lesion analyses reported in this paper absolutely have to be redone using d-prime measures instead of overall accuracy. Until then, we cannot be sure how much of these findings are driven by response bias.

Some of you are still very confident that even with d-prime, the results will look similar. I wonder myself whether it will change much. But there's another reason to be suspicious...

2. Figure 5. This figure shows the correlation between the percentage of lesioned tissue in the IFG and overall score on gesture recognition, broken down by gesture type, transitive and intransitive. GRD+ and GRD- patients are included in these correlations. Both correlations are reported as significant, with r-values > .55; not bad. But it is clear from looking at the graphs that the effect is driven by the -GRD group, who all cluster together.

But isn't that part of the point? Patients without gesture recognition deficits don't have much damage to IFG, so they should cluster together, right? True, but it should also be the case that for those patients WITH gesture recognition deficits there should still be a strong correlation with amount of lesioned tissue in IFG (assuming there's enough variance in the lesioned tissue, which there is). However, if you remove the -GRD patients (the square symbols), it looks as though the regression line would be completely flat! This is true even though the percentage of damaged tissue in the +GRD group ranges from ~5% to ~55%. In fact, eyeballing the top graph you can see that the three patients with the LEAST amount of damage to IFG (about equal to the amount damaged in the -GRD group!) are performing the same as the four patients who have the MOST IFG damage! Further, all the patients in between these two extremes are distributed in a nebulous cloud that kind of looks like an "X" that is as wide as it is tall (can you find the cross-over double-dissociation in the X?). If IFG were the critical substrate for gesture recognition, one would expect some kind of pattern here, yet there is none.
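The logic of that argument can be sketched with a toy example in Python. To be loud about it: every number below is INVENTED purely for illustration and none of it comes from the paper or from Figure 5. The sketch shows how a tight cluster of low-damage, high-scoring patients can produce a sizable pooled correlation (negative here, since more damage goes with lower scores) even when there is no relationship at all within the impaired group.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# INVENTED toy values, NOT data from Pazzaglia et al.
# Hypothetical group without a recognition deficit: little damage, high scores.
no_deficit_lesion = [0, 2, 3, 5]
no_deficit_score = [95, 90, 92, 95]

# Hypothetical group with a deficit: damage spans 5% to 55%, scores hover
# around 60 with no trend.
deficit_lesion = [5, 15, 25, 35, 45, 55]
deficit_score = [60, 55, 65, 65, 55, 60]

pooled_r = pearson_r(no_deficit_lesion + deficit_lesion,
                     no_deficit_score + deficit_score)
within_r = pearson_r(deficit_lesion, deficit_score)

print(f"pooled r (both groups)     = {pooled_r:.2f}")
print(f"within-deficit-group r     = {within_r:.2f}")
```

With these made-up values the pooled correlation comes out at roughly -0.7 while the correlation inside the impaired group is zero. The pooled value reflects only the gap between the two clusters and says nothing about whether more damage means worse performance among the patients who actually have a deficit.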

It seems quite clear that IFG damage is NOT predicting gesture discrimination deficits. Something else is driving the effect that shows up in the beautiful lesion analyses.

Overall conclusion: Pazzaglia et al. made a valiant effort, and should be commended for looking at their data from so many angles, and for reporting their findings openly and straightforwardly. As it stands, however, the data do not support the mirror neuron theory of action understanding. In fact, the data seem to indicate (i) action production and action understanding dissociate, and (ii) damage to the presumed human homologue of monkey area F5 does not correlate with action understanding deficits.

I would love to see the follow-up paper that re-analyzes all the data using d-primes. Maybe then we'll have at least one piece of evidence supporting the mirror neuron theory of action understanding. To date, there isn't any evidence in support of the theory.


Anonymous said...

Yes, it's a very interesting paper and they've been very thorough. A d-prime analysis would indeed be interesting. I don't really know very much about gestural understanding, but I think the other main criticism you make highlights a crucial methodological problem we have to deal with in all patient studies when trying to create structure/function mappings.

In short, I think they've biased themselves toward the problems you've highlighted in their correlations by the way they generated their original lesion overlap subtraction map, i.e. making a binary classification between GRD+ / GRD- patients. In essence, they threw away any power they could have got from behavioural variance within each group at this point.

It's great that they've attempted to use a continuous scale of tissue damage within the IFG (i.e. % of lesioned voxels within the IFG) rather than a binary "IFG Damaged"/"IFG Intact" classification, but this isn't a great deal of use if they've used a binary classification of patients to locate the IFG in the first place. If they'd instead used a continuous measure of structural integrity voxel-by-voxel across the whole brain then they could have then used the continuous behavioural data as well to search for areas where damage really does correlate with performance.

Greg Hickok said...

True, the categorization and subtraction method does throw out a lot of variance, but what I liked about this paper was that the data were examined using a range of different analysis approaches. So the VLSM approach used all the data in its continuous form (and found IFG localization), and the graphs of the (non-)correlation that I highlighted are basically ROI analyses: anatomically defined regions were examined for correlations between amount of tissue damage and behavioral performance. Again, significant correlations were found, but when we see the plot, it becomes clear that something funny is going on, as I discussed.

I have nothing but praise for these investigators because they could have easily stopped after showing the subtraction and VLSM images, and we would have never known that there may be underlying problems with the result. Instead they showed us the actual data in the critical region, and as a result we learned something.

In fact, we learned two things: (1) the IFG may not be critical for gesture discrimination, and (2) even the cleanest looking data can fail to reveal the truth.

Anonymous said...

You're right. Sorry, I misread the VLSM part and thought they'd only performed that analysis on the IFG in isolation. This is a nice result. VLSM is a BIG improvement on lesion overlap maps, but it doesn't quite make use of all the data, as it still involves a binary classification, i.e. classifying tissue as damaged/undamaged on a voxel-by-voxel basis (in this case, I think they simply marked all voxels within each manually drawn lesion outline as damaged). To my mind, using a continuous measure of tissue integrity at each voxel (derived from a measure of signal intensity) and testing this against behavioural performance is a more powerful way to do this sort of analysis. Firstly, it's more sensitive when detecting tissue damage than are manual methods (signal intensity typically needs to be ~20% lower than normal before it can be detected by eye), and secondly, it allows you to examine the regression fit graphically to see which subjects are driving it. I'm a fan of the VBM technique (when used properly) for answering this sort of question, although I know a lot of people out there aren't.

It's an impressive paper in its thoroughness, as you say. I've done quite a lot of imaging work with patients, and know that an immense amount of work must have gone into this study. Patient data is always fascinatingly complex, so it's great that the authors have looked at it from so many angles.

Anonymous said...

I agree that this type of work is the best evidence that mirror networks may be involved in action understanding (see also Bosbach et al., Nature Neuroscience 2005, 8;1295 for intriguing insights).

A few problems here though; first, they don't actually test for action 'understanding'. Their test is really a test of action discrimination/recognition but they refer to it throughout as action understanding. If the patients were unable to predict the consequences of an action or unable to state the goal, I might be more persuaded that the deficit constitutes an impairment in understanding.

Also, they have lumped all subtypes of apraxia together. From the little I know about apraxia, this seems a little odd. Ideomotor apraxia (intact object knowledge but impaired ability to imitate) is a result of deficits to the 'visuomotor' route to action, whereas ideational/conceptual apraxia is characterised by deficits to object semantics (patients cannot link an object with the appropriate action). From what is known about action execution networks I'd be surprised if there was a single area in which lesions would result in both of these conditions.

I'm sure they didn't separate the groups for power reasons, but given that they tested for subtypes of apraxia in the assessment phase, it would have been nice to see how the results look between the groups.