Jeff Bowers has published a paper or two arguing for the viability of grandmother cells -- cells that represent whole "objects" such as a specific face (or your grandmother's face). At issue, of course, is whether the brain represents information in a localist or distributed fashion and Jeff has used his case for grandmother cells as evidence against a basic assumption of parallel distributed processing (PDP) models. But the PDP folks don't seem to think "distributed" is a necessary property of PDP models. So in the guest post below, Jeff asks, What does the D in PDP actually mean? This is an interesting question, and Jeff would like to know your thoughts (see the new survey to respond). I'd also be interested in your thoughts on grandmother cells!
Guest Post from Jeff Bowers:
I’ve been involved in a recent debate regarding the relative merits of localist representations and the distributed representations learned in Parallel Distributed Processing (PDP) networks. By localist, I mean something like a word unit in the interactive activation (IA) model – a unit that represents a specific word (like a “grandmother cell”). By distributed, I mean that a familiar word (or an object or a face, etc.) is coded as a pattern of activation across a set of units, with no single unit sufficient for representing an item (you need to consider the complete pattern). In Bowers (2009, 2010) I argue that the neuroscience is more consistent with localist coding than with the distributed representations in PDP networks, contrary to the widespread assumption in the cognitive science community. That is, single-cell recordings of neurons in cortex and hippocampus often reveal neurons that are remarkably selective in their responses (e.g., a neuron that responds to one face out of many). I took this to be more consistent with localist theories than with distributed PDP theories.
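To make the distinction concrete, here is a toy numeric sketch (my illustration, not anything from Bowers or the PDP literature). Three hypothetical words are coded either with one dedicated unit each (localist, like the IA model's word units) or as overlapping patterns over shared units (distributed); the specific patterns are invented for illustration.

```python
# Toy illustration: localist vs. distributed coding of three words
# across a small bank of units. Patterns are invented for the example.

# Localist: one dedicated unit per word (like an IA word unit).
localist = {
    "DOG": [1, 0, 0],
    "CAT": [0, 1, 0],
    "FOG": [0, 0, 1],
}

# Distributed: each word is a pattern over shared units; every active
# unit is shared with some other word, so no single unit identifies
# a word -- identity lives in the whole pattern.
distributed = {
    "DOG": [1, 1, 0, 1],
    "CAT": [0, 1, 1, 0],
    "FOG": [1, 0, 1, 1],
}

def diagnostic_units(codes, word):
    """Units active for `word` and for no other item in `codes`."""
    others = [v for w, v in codes.items() if w != word]
    return [i for i, a in enumerate(codes[word])
            if a and not any(o[i] for o in others)]

# Each localist word has exactly one unit all to itself...
for w in localist:
    assert len(diagnostic_units(localist, w)) == 1

# ...but in this distributed code no word has a diagnostic unit,
# even though the three patterns are all distinct.
print(diagnostic_units(distributed, "CAT"))  # -> []
```

The point of the sketch is only the definitional one above: in the distributed scheme you must read the full pattern to recover the item, which is exactly the property at issue in the grandmother-cell debate.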
This post, however, is not about whether localist or PDP models are more biologically plausible. Rather, I’m curious as to what people think the theory behind PDP models is; specifically, what is your understanding of the relation between distributed representations and PDP models? In Bowers (2009, 2010) I claim that PDP models are committed to the view that information is coded in a distributed format rather than a localist format. On this view, the IA model of word identification, which includes single units to code for specific words (e.g., a DOG unit), is not a PDP model. Neither are neural networks that learn localist representations, like the ART models of Grossberg. On my understanding, a key (necessary) feature of the Seidenberg and McClelland model of word naming that makes it part of the PDP family is that it learns distributed representations of words – it gets rid of localist word representations.
However, Plaut and McClelland (2010) challenge this characterization of PDP models. That is, they write:
In accounting for human behavior, one aspect of PDP models that is especially critical is their reliance on interactivity and graded constraint satisfaction to derive an interpretation of an input or to select an action that is maximally consistent with all of the system’s knowledge (as encoded in connection weights between units). In this regard, models with local and distributed representations can be very similar, and a number of localist models remain highly useful and influential (e.g., Dell, 1986; McClelland & Elman, 1986; McClelland & Rumelhart, 1981; McRae, Spivey-Knowlton, & Tanenhaus, 1998). In fact, given their clear and extensive reliance on parallel distributed processing, we think it makes perfect sense to speak of localist PDP models alongside distributed ones. (p. 289)
That is, they argue that the PDP approach is not in fact committed to distributed representations. Elsewhere they write:
In fact, the approach takes no specific stance on the number of units that should be active in representing a given entity or in the degree of similarity of the entities to which a given unit responds. Rather, one of the main tenets of the approach is to discover rather than stipulate representations. (p. 286)
So on this view, the PDP approach does not rule out the possibility that a neural network might actually learn localist grandmother cells under the appropriate training conditions.
With this as background, I would be interested in people’s views on this. Here is my question:
Are PDP theories of cognition committed to the claim that knowledge is coded in a distributed rather than a localist format? [see new survey]
Thanks for your thoughts,
Bowers, J. S. (2009). On the biological plausibility of grandmother cells: Implications for neural network theories in psychology and neuroscience. Psychological Review, 116(1), 220-251. PMID: 19159155
Bowers, J. S. (2010). More on grandmother cells and the biological implausibility of PDP models of cognition: A reply to Plaut and McClelland (2010) and Quian Quiroga and Kreiman (2010). Psychological Review, 117(1). PMID: 20063980
Plaut, D. C., & McClelland, J. L. (2010). Locating object knowledge in the brain: Comment on Bowers’s (2009) attempt to revive the grandmother cell hypothesis. Psychological Review, 117(1), 284-288. DOI: 10.1037/a0017101