Ten years ago the action understanding interpretation of monkey mirror neurons was the only game in town. There really was no other viable account, so even if there were problems with the theory (e.g., 8 in particular), it was the best we had. Now there are alternative explanations. Cecilia Heyes has argued that mirror neurons reflect learned sensorimotor associations (that don't support understanding). Recent writings of Michael Arbib and, separately, James Kilner have argued that mirror neurons fundamentally serve a motor control function but can be used fruitfully to augment perceptual function via predictive coding. And I have argued for something of a hybrid between Heyes and Arbib/Kilner: MNs reflect learned sensorimotor associations that are critical for motor control (action selection specifically) and may modulate perception a tiny bit under rather rare circumstances.
This is great progress because it means we are now in a position to evaluate the various theories against existing data and see which one does a better job of explaining the facts.
I have argued extensively that the action understanding theory does not hold up well to lesion data. Disruption of the mirror system by stroke, sodium amytal, degenerative disease, or developmental disease does not impair action understanding in the way that the Parma story predicts it should. Add to this an impressive, new, large-N study on gesture comprehension, and the evidence against the action understanding theory in humans is overwhelming.
But what about monkey mirror neurons? They still look like they are coding some sort of action understanding, right? Not if you actually look at the data rather than reading the headlines.
I discussed this issue in my debate with Gallese. You can watch the whole thing here, but to put the argument into a condensed form, I reiterate it below.
First, what got everyone so excited about mirror neurons in the first place is that some of them showed a fairly strict congruence in their response preference for executed and observed actions: cells that responded to, say, whole hand grasping in both execution and observation. There are other, more broadly congruent mirror neurons too, but these took a theoretical back seat in the 1990s. But there was a problem: strictly congruent mirror neurons aren't that useful for understanding because they can't recognize that grasping with a whole hand grip and grasping with a pincer grip are both instances of grasping. They are simply too specific. So the bulk of the theoretical work with monkey mirror neurons has shifted to broadly congruent mirror neurons, which, in fact, are more common anyway (see below). Here are some quotes from this paper by the Parma group to prove that I'm not making this stuff up:
How is understanding achieved?
“The similarity between the motor representation generated in observation and that generated during motor behavior allows the observer to understand others’ actions, without the necessity for inferential processing.”
What counts as similar?
“neurons in F5 code the goal of the motor act [grasping, holding, tearing], regardless of how it is achieved.”
“The defining characteristic of F5 mirror neurons is that they fire in response to the presentation of a motor act, which is congruent with the one coded motorically by the same neuron.”
Which types of mirror neurons are critical?
“the vast majority of F5 mirror neurons, termed broadly congruent respond to different motor acts, provided that they serve the same goal (Gallese et al. 1996).”
“Thus, like the visual system, where, as postulated by Shepard (1984), resonating elements (neurons or neuronal assemblies) respond maximally to a set of stimuli, but are also able to respond to similar stimuli when they are incomplete or corrupt, a set of mirror neurons (broadly congruent) appears to resonate to all visual stimuli that have sufficient critical features to describe the goal of a given motor act.”
So what do the data show? Most of the relevant data come from the first major mirror neuron study, in which a range of actions was examined. After that initial study, research on monkey mirror neurons has focused almost exclusively on one type of action: grasping. (We should probably worry about that.) So let's look at that first and more thorough study.
Here is the distribution of cell types:
1. Strictly congruent: 31.5%
Same goal (e.g., grasping), same motor act (e.g., precision grip)
Can’t capture the similarity in goal between grasping with precision grip vs. whole hand grip as pointed out above.
2. Broadly congruent: 60.9%
- Type 1 (12.5%): execution response=“highly specific” (e.g., grasping w/precision grip); observation response more general (precision or whole hand)
Captures the similarity between precision and whole hand grasping, but “interprets” them as one or the other specific type of grasping and doesn’t capture the similarity between grasping with hand & mouth, for example. Therefore these have a similar problem to the strictly congruent MNs.
- Type 2 (82%): execution response=one goal (e.g., grasping); observation response > 1 goal (e.g., grasping or manipulating)
Falsely collapses different goals onto a single goal, i.e., confuses manipulating and grasping.
- Type 3 (5%): execution response=grasping; observation response=grasping with hand, grasping with mouth
Responds to goals! This is a useful subtype for action understanding. But only 3 cells out of 92 mirror neurons & only one goal represented (grasping). If you want to maintain an action understanding theory, this is what you have to hang your hat on.
3. Non-congruent (7.6%)
No obvious relation between execution and observation preferences. Not useful for understanding.
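The subtype percentages above are fractions of the broadly congruent cells, not of all 92 mirror neurons, so it can help to put everything on one scale. The snippet below is a minimal sketch of that arithmetic. The cell counts are an assumption: they are back-calculated from the percentages quoted above (for example, 60.9% of 92 is about 56, and 12.5% of 56 is 7), not copied from the original paper, and the category labels are just illustrative names.

```python
# Cell counts back-calculated from the percentages quoted in the post
# (an assumption, not figures copied from the original paper).
TOTAL_MIRROR_NEURONS = 92

counts = {
    "strictly congruent": 29,         # ~31.5% of 92
    "broadly congruent, type 1": 7,   # 12.5% of the 56 broadly congruent cells
    "broadly congruent, type 2": 46,  # ~82% of 56
    "broadly congruent, type 3": 3,   # ~5% of 56 (the "3 cells out of 92")
    "non-congruent": 7,               # ~7.6% of 92
}


def share_of_total(n, total=TOTAL_MIRROR_NEURONS):
    """Return n as a percentage of all mirror neurons, rounded to one decimal."""
    return round(100 * n / total, 1)


# Sanity check: the back-calculated counts must add up to the full sample.
assert sum(counts.values()) == TOTAL_MIRROR_NEURONS

shares = {label: share_of_total(n) for label, n in counts.items()}

for label, pct in shares.items():
    print(f"{label:28s} {counts[label]:3d} cells  {pct:5.1f}% of all mirror neurons")
```

If these back-calculated counts are right, Type 2 cells alone make up half of the whole sample, while the Type 3 cells come to roughly 3% of it.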
There are more problems, and they may apply even to the 3/92 cells that have the right response properties for understanding, which makes the suitability of even those cells questionable. Mirror neurons are sensitive to all sorts of features that have nothing to do with action understanding. Here's a list:
And indeed the Parma group has acknowledged this and claimed that the mirror neuron system "contributes to choosing appropriate behavioral responses to those actions" (Caggiano et al. 2009).
Notice that all of these response properties make sense if this system is simply coding relations between a range of actions and a range of possible action responses. For example, Type 2 broadly congruent mirror neurons (by far the most common) take multiple possible observed actions and map them onto a single executed response. This is useful for motor selection if a single motor response is appropriate to multiple cue types, but it is not useful for understanding. For another example, the value of the grasped object should modulate response selection (do I want to grasp that object?) but should not play a role in action understanding.
The evidence is overwhelming:
1. Monkey mirror neurons have response properties that do not fit the action understanding theory and instead fit an action selection account.
2. Human data from stroke and other neurological conditions clearly demonstrate a dissociation between action execution and action understanding ability in a range of domains (speech, praxis, sign language, emotional face recognition).
If this isn't convincing, what evidence do we need to reject the action understanding account? Or is it an unfalsifiable theory?