Here's what they wrote (italics theirs):
...these [mirror] neurons are primarily involved in the understanding of the meaning of 'motor events', i.e., of actions performed by others. (p. 97)
...this is why, when it [the monkey] sees the experimenter shaping his hand into a precision grip and moving it towards the food, it immediately perceives the meaning of the 'motor events' and interprets them in terms of an intentional act. (p. 98)
This is fairly standard mirror neuron speak. It was the next section that made me decide to stop reading.
There is, however an obvious objection to this: as discussed above, neurons which respond selectively to the observation of the body movements of others, and in certain cases to hand-object interactions, have been found in the anterior region of the superior temporal sulcus (STS). We have mentioned that the STS areas are connected with the visual, occipital, and temporal cortical areas, so forming a circuit which is in many ways parallel to that of the ventral stream. What point would there be, therefore, in proposing a mirror neuron system that would code in the observer's brain the actions of others in terms of his own motor act? Would it not be much easier to assume that understanding the actions of others rests on purely visual mechanisms of analysis and synthesis of the various elements that constitute the observed action, without any kind of motor involvement on the part of the observer? (p. 98-99)
A very good question. They go on to note,
Perrett and colleagues demonstrated that the visual codification of actions reaches levels of surprising complexity in the anterior region of the STS. Just as an example, there are neurons which are able to combine information relative to the observation of the direction of the gaze with that of the movements an individual is performing. Such neurons become active only when the monkey sees the experimenter pick up an object on which his gaze is directed. If the experimenter shifts the direction of his gaze, the observation of his action does not trigger any neuron activity worthy of notice. (p. 99)
So why is the STS, with its much more selective response properties to action perception, not a candidate neural basis for action understanding? The answer is...
However, we must ask whether this selectivity -- or, in more general terms, the capacity to connect different visual aspects of the observed action -- is sufficient to justify using the term 'understanding'. The motor activation characteristic of F5 and PF-PFG adds an element that hardly could be derived from the purely visual properties of STS -- and without which the association of visual features of the action would at best remain casual, without any unitary meaning for the observer. (p. 99, end of paragraph)
Not only is this pure speculation, but the same question is NEVER asked of mirror neurons. Here is their passage with the roles of STS and F5 swapped:
However, we must ask whether this selectivity -- or, in more general terms, the capacity to connect motor aspects of the observed action -- is sufficient to justify using the term 'understanding'. The sensory activation characteristic of STS adds an element that hardly could be derived from the far less specified properties of F5 -- and without which the association of sensory-motor features of the action would at best remain casual, without any unitary meaning for the observer.
A typical response to this kind of critique is that "it's the activity of the WHOLE circuit that is important, not just mirror neurons in F5." But this is vacuous hand-waving. If this is really the claim, then why is the visual percept "casual" and without "unitary meaning," while the motor component is the one that adds meaning? Why isn't it the reverse? Why isn't the reverse ever considered?
The other glaring logical party-foul with R&S's claim is that if they are correct, monkeys should only be able to understand actions that mirror neurons code: grasping, tearing, holding, etc. All the others would be casual and without unitary meaning. Does it make sense from an evolutionary standpoint to have a system that is only capable of understanding visual actions or events that also have a motor representation? Or would it be useful for the animal to understand that a hawk circling above is a bad thing? And if you want to claim that the animal doesn't really 'understand' what a circling hawk 'means', that it only reacts to it reflexively, then you are obliged to prove to me that the monkey does 'understand' grasping actions and is not just reacting reflexively.
Here's my guess as to what mirror neurons are doing.
1. Action understanding is primarily coded in the much more sophisticated STS neurons.
2. The F5-parietal lobe circuit performs sensory-motor transformations for the purpose of guiding action.
3. Populations of F5 neurons code specific complex actions such as grasping with the hand using a particular grip, or perhaps these populations are part of the transformation (started in parietal regions) between a sensory event and a specific action.
4. F5-driven actions (or sensory-motor transformations) can be activated by objects (canonical neurons), or by the observation of actions (mirror neurons).
5. [prediction:] Mirror neurons are only one class of action-responsive cells in F5. Others code non-mirror action observation-action execution responses such as when a conspecific presents its back and a grooming action may be elicited.
6. [prediction:] F5 neuron populations are plastic. If the animal is trained to reach for a raisin upon seeing a human waving gesture or a dog's tail wag or a picture of the Empire State Building, F5 cell populations will code this association such that F5 cells may end up responding to tail wagging. (For example, see Catmur et al., 2007, although admittedly this is a human study and may not apply to macaques.)
7. Mirror neurons mirror because there is an association between seeing a reaching/grasping gesture and executing the same gesture. This could arise either because of natural competitive behavior (seeing another monkey reach may cue the presence of something tasty and generate a competitive reach) or because of the specific experimental training situation.
As far as I know, there is no way empirically to differentiate these ideas from the action understanding theory. However, the present suggestion can explain why STS neurons code actions so much more specifically than mirror neurons (because STS is critically involved in action understanding) and it does not limit 'understanding' to motor behaviors, which seems desirable. I look forward to seeing a flood of studies in Nature and Science testing alternative theories of mirror neuron function. (Yeah, right.)
So what in the world will I talk about if not mirror neurons? Well, the motor theory of speech perception is still on the table. Unlike mirror neurons, that is squarely in my research program. It is also an interesting topic because it provides an excellent test case for mirror neuron theory as it is applied to humans, just as speech was the critical test case for phrenology. (Yes, I am comparing mirror neurons to phrenology -- both very interesting ideas that were unsubstantiated when first proposed and that captured the scientific and public imagination.)
Catmur C, Walsh V, & Heyes C (2007). Sensorimotor learning configures the human mirror system. Current Biology, 17(17), 1527-1531. PMID: 17716898