More from William Matchin -- Reply to Erich Jarvis:
At the most recent SfN, Erich
Jarvis gave the opening presidential address on the functional neuroanatomy of
language, which I commented on and critiqued in my recent blog post for Talking
Brains (
http://www.talkingbrains.org/2017/11/abstractness-innateness-and-modality.html).
Erich has briefly responded to my writing on Twitter and suggested a debate. Few
things could give me more pleasure than a productive debate on central issues
concerning the nature of human language. The following is a response to his
comments in the context of a more in-depth exploration of the issues under
discussion regarding the phenotype of language (both cognitive/behavioral and
neurological) and its evolution. In general, I believe that we have far more points of agreement than disagreement, although there remain fundamental divides, not least of which is the nature of sign language and its connection to spoken language, which I believe reveals the essence of language itself. Erich’s comments are marked “EJ:” below, and his quotations of my words are marked “WM:”.
EJ:
Dear William. I did some holiday reading of your #SfN17 blog. You said you like a debate.
In this series of tweets I challenge a number of your claims as misguided in a
manner I have commonly seen in linguistics. But I support your effort in summarizing
the major language talks. Some of
your comments about my #SfN17 presidential talk indicate that you did
not correctly hear what I said, and thus made some incorrect statements about
what I said. I will correct them.
While I consider myself a linguist, my degrees and primary training are
in cognitive psychology and neuroscience. I don’t think our disagreements stem
from the distinction between the fields of linguistics and neuroscience but
rather how to characterize the human language phenotype in comparison to the cognitive
systems found in other animals.
WM:
central question Erich Jarvis addressed...was whether human language is an
elaborate form of vocal learning seen in other animals or ...a horse of a
different color
EJ:
No, I said vocal learning is 1 component of spoken-language, & yes more
elaborate in humans.
WM: Jarvis is an expert
of the biology of birdsong, & he argued that human language is continuous
with vocal learning in non-human organisms...
EJ: I argued that vocal
learning is itself a continuous trait, that contributes to spoken language. I
didn't arguing they're the same
Both of us agree that language has
multiple interacting components, many of which are shared with other animals. I
think Erich’s work on this topic is extremely helpful for understanding the
function of certain neural language circuits in humans (more on this below).
Our agreement goes further than this: we agree that the language-specific biological components are minimal. In this respect, the perspectives of Chomsky
and colleagues (e.g. Hauser, Chomsky & Fitch, 2002; Berwick & Chomsky,
2015; Bolhuis et al., 2014), myself, and Erich and his colleagues are fundamentally
aligned. The disagreement concerns which capacities are language-specific (if
any) and the impact that these components have on the behavioral and cognitive lives
of humans.
It seems from Erich’s presentation
and his published work that he asserts that vocal learning is the central
component of human language. Consider this first line of the abstract of Erich’s
2004 review paper: “Vocal learning, the
substrate for human language, is a rare trait…” [emphasis mine]. I take the
phrase “the substrate for human language” to mean that it is a critical component
of human language, a sine qua non. If
Erich does not endorse this position, then we are closer to agreement –
clarification on this point would help greatly. This is especially important
for considering sign language and its relation to spoken language.
WM:
don't think Jarvis mentioned sign language once during entire talk (except
non-human gestures)
EJ:
I believe language-like perception & production circuits (including sign)
exist before speech. Speech circuits inherited their functions & all became
more advanced in humans.
WM:
All of these observations tell us that there is nothing important about
language that must be expressed in the auditory-vocal modality.
EJ:
Agree with “must”, but auditory-vocal modality is dominant. Its hard to read or
think w/o silently talking to & hearing yourself.
My understanding of Erich’s main theoretical
position, the motor theory of vocal learning origin (Feenders et al., 2008), is
that many animals have cortical-subcortical motor circuits that allow for precise
control of peripheral appendages (e.g., hands, claws, paws, wings) that were
then duplicated and adapted for use in control over the vocal tract. This duplication occurred independently in vocal learning animals yet relies on a common genetic substrate, and underlies the ability of humans to learn and produce complex speech sequences.
I strongly endorse this. One
feature I appreciate about this proposal is that it focuses on a very specific
neuroanatomical-functional circuit (with accompanying genetic underpinnings).
This view suggests that speech may be special in some ways, but it clearly has its basis in pre-existing neural circuitry found both in other cognitive domains (such as motor control of the arms and legs) and in other animals (such as vocal learning birds). It allows for tractable comparative behavioral
and neuroscience research that may prove useful for understanding the human capacity
for language. As I mentioned above, this approach is aligned with the
minimalist approach of Chomsky and colleagues that seeks to eliminate as much
domain-specific machinery from theories of language as possible. We’re all on
the same page here.
In fact, I go one further than
Erich. In a recent paper (Matchin, 2017) I laid out the evidence for the
hypothesis that language-specific portions of anterior Broca’s area acquire
their proclivity for higher-level aspects of language through a process of
neuronal retuning, in which pre-existing computational circuits for speech
production are harnessed for more abstract language functions (either genetic exaptation or developmental neuronal recycling). On the point that language
makes use of pre-existing computational machinery, I think that Erich and I
agree heartily.
If so, where do Erich and I disagree?
For one, we disagree about the nature of sign language and its relation to
spoken language. Erich appears to posit that sign language and spoken language
inhabit similar yet distinct circuits in the brain, and does not seem to
endorse the view that sign and speech share the same core linguistic computations
that are absent in non-human organisms (on this latter point it is difficult to
make out Erich’s view). Erich’s papers and talks discuss only vocal learning and the classic speech circuits for production in posterior Broca’s area and perception in the superior temporal gyrus, which may be specialized for auditory-vocal language. Yet his work ignores the well-supported advances made in neuroimaging
and aphasia research in the last several decades regarding the localization of
central aspects of language to association cortex outside of these speech
regions (see Hickok &amp; Poeppel, 2007; Mesulam et al., 2014; Fridriksson et
al., 2016; and Blank et al. 2016 for reviews).
Humans are the only organism I am
aware of that can communicate equally well in either the auditory-vocal or
visual-manual modality. Converging data from psycholinguistic experiments, linguistic analyses, developmental studies, and neuroscience show that sign and spoken language share many core properties that appear to be central to the
human language phenotype, many of which are qualitatively distinct from other
animals (see e.g. Klima & Bellugi, 1979; Petitto, 1994; Sandler &
Lillo-Martin, 2006; MacSweeney et al., 2002; Emmorey et al., 2007; Leonard et
al., 2012; and Matchin et al., 2017). While it may be the case that spoken
language is the default form of communication, as I pointed out in my original
blog post one can easily imagine an alternate history of our world in which the
dominant languages are sign languages, with obscure spoken languages used by
blind communities.
By contrast, I posit a mixed view:
sign and speech share brain circuits for lexical access, syntax, and semantics,
while systems of perception and production may inhabit distinct cortical
locations. Consider the figure below, which illustrates my view of language comprehension circuits in the brain. The yellow areas represent
secondary visual cortical regions involved in the perception of sign, which are
distinct from the blue areas representing secondary auditory cortex involved in
the perception of speech. Both systems converge on a common underlying language
system that is neutral with respect to sensory modality, involved in the
perception of words, syntactic combination, and the interpretation of meaning. This includes a human-unique anatomical asymmetry between the left and right superior temporal sulcus (Leroy et al., 2015), smack in the middle of the red areas
indicated on the brain figure. Given the standard assumption of left hemisphere
language dominance, this suggests that humans possess unique brain organization
in the superior temporal sulcus underlying their capacity for language.
I think that when one seriously examines the computational properties of sign language and its neurobiology, it is difficult not to conclude that there is a central human-unique component
of language that is modality-independent. Naturally, this raises the question
of what exactly these components are,
which I turn to next.
WM:
3 main challenges to a continuity hypothesis were entirely omitted or
extravagantly minimized: syntax, semantics, & sign language
EJ:
I mentioned syntax (e.g. Foxp2 impaired mice Chabout 2015) & semantics
(e.g. usage vocal learning) both in support of continuum hypothesis
WM:
3 main challenges to a continuity hypothesis were entirely omitted or
extravagantly minimized: syntax, semantics, & sign language
EJ:
In order for something to be continuous, you need to have a more advanced form
and a more 'extravagantly' minimized form. That makes sense.
Erich has misunderstood me here; my
phrase “extravagantly minimized” refers to his minimal review of the phenotype
of human language. It is possible to claim that syntax and semantics are in
large part conserved between non-human organisms and humans if the details of their
key properties have not been discussed. These properties have been supported by
decades of neuroscience, psychology, and linguistics research. Here I will
simply list some of them, for which there is no evidence in bird vocalizations, as
reviewed in Berwick et al. (2012), a paper that I strongly recommend:
· Unbounded non-adjacent dependencies
· Phrases “labeled” by element features
· Asymmetrical hierarchical phrases
· Hierarchical self-embedding of phrases of the same type
· Hierarchical embedding of phrases of different types
· Phonologically null chunks
· Displacement of phrases
· Duality of phrase interpretation
· Crossed-serial dependencies
· Productive link to “concepts”
While all of these are important,
I will focus on the last: human language syntax combines meaning-bearing
elements to produce new sentences with novel meanings. This is an essential component
of language. For instance, I can combine (in very particular ways) the phrase “the alien” with the verb “serenaded” and the phrase “the giant squid”, producing a sentence that has (hopefully) never been perceived before. This does not happen in birdsong, where individual meaningless
elements combine to produce meaningless expressions that serve a monolithic
function of territory defense or mate attraction.
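To make this point concrete, here is a minimal toy sketch in Python (my own illustration; the lexicon and rules are invented for this post and are not taken from any of the papers cited here). It generates novel, meaningful sentences by hierarchically self-embedding clauses of the same type, one of the properties listed above:

import itertools

# Toy lexicon of meaning-bearing elements (invented for illustration)
noun_phrases = ["the alien", "the giant squid", "the parakeet"]
transitive_verbs = ["serenaded", "admired"]
clause_taking_verbs = ["claimed", "denied"]

def sentences(depth):
    """Yield sentences of the form NP V NP; when depth > 0, embed a whole
    clause of the same type under a clause-taking verb (hierarchical
    self-embedding of phrases of the same type)."""
    if depth == 0:
        for subj, verb, obj in itertools.product(noun_phrases, transitive_verbs, noun_phrases):
            yield f"{subj} {verb} {obj}"
    else:
        for subj, verb in itertools.product(noun_phrases, clause_taking_verbs):
            for clause in sentences(depth - 1):
                yield f"{subj} {verb} that {clause}"

# Print a few sentences that have (hopefully) never been uttered before
for s in itertools.islice(sentences(2), 3):
    print(s)

Even this trivially small grammar licenses an unbounded set of sentences as the embedding depth grows, and the meaning of each one is computed from the meanings of its parts and the way they are combined. Nothing like this compositional productivity is evident in birdsong.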
Erich suggested that this sort of
combinatorics does occur to some extent in certain vocal learning birds. As
evidence, he played a video of a parakeet producing the following: “What seems
to be the problem officer? I am not a crook. My name is Disco – I’m a parakeet.”
In a different talk, Erich stated the following: “Disco learned up to 400 words
in four years and he could recombine words into new sentences, many times they
don’t have meaning to the listening people but other times they do, like you
hear here. That’s quite remarkable.”
I agree that it’s remarkable. However,
when watching the full video of Disco, one notices that this bird starts to
sound a bit … well, not human-like. Here’s more from Disco (https://www.youtube.com/watch?v=EFJeY9fL5tk):
“Baby disco. Give me a kiss [kiss
sound]. Gonna get that belly. Gonna get that belly. Gonna get that belly. Gonna
get that belly. Gonna get that belly. Disco, meet the Disco, he’s a dabba-dabba
do-bird. [unintelligible]. Bird bird. [beat-boxing]. I’m a parakeet – bird to
your mother. What seems to be the profficer. I am not a crook – my name is
Disco. I’m a parakeet. What seems to be the problem officer. I am not a crook –
my name is Disco. I’m a parakeet. [beat-boxing]. I’m a parakeet – bird to your
mother. Nobody … [distracted]. Disco, meet the Disco, he’s a dabba-dabba
do-bird. [unintelligible]. Bird bird. Disco budgie in the house tonight. Eat
some millet and have a good tide [sic]. Domo arigato Mr. Roboto. [chirp] Oh
there goes Tokyo. Go go Godzilla. Uh shadoobay shattered, shattered. Shadoobay
shattered, shattered. Domo arigato Mr. Roboto. Disco budgie in the house
tonight. Eat some millet and have a tide. Domo arigato Mr. Roboto. Domo. Domo
arigato Mr. … [distracted]. Domo arigato Mr. … [distracted]. Beat box budgie.
Domo arigato Mr. Roboto. Domo. Domo arigato Mr. … [distracted]. Don’t just
stand there, bust a move. [unintelligible] and prosper. [unintelligible] and
prosperd. [unintelligible]. Where’s the beef. Shadoobay shattered, shattered.
Shattered. Domo arigato Mr. Roboto. I’m Disco and I know it. I’m Disco and I
know it. What did Momma say. Nobody puts baby bird in corner. Never shake a
baby bird. Never shake a baby bird. Nobody puts baby bird in corner. Never
shake a baby bird. What seems to be the problem officer? I am not a crook – my
name is Disco. I’m a parakeet. Mean poopachine. What did Momma say. There goes
Tokyo. Go go Godzilla. [chirping] Disco. There goes Tokyo. Go Godzilla. There
goes Tokyo. Go go Godzilla. There goes Tokyo. Go gox budgie. Baby got bax
budgie. [unintelligible] that belly. Gotta get that belly. Gotta get that
belly. What you talking about Disco. Ooh la la, give me a big kiss. Ooh la la,
give me a big kiss. Ooh la la, give me a big kiss. [kiss sound].”
Disco is not talking in the way that a healthy human does (in fact, he sounds a bit like a patient with fluent aphasia). He is ejecting the patterns of speech that he has perceived, with minimal recombination of underlying meaningful elements. This is not a particularly novel observation about parakeets. As phonologically complex as bird vocalizations are compared to the relatively much simpler barks of dogs, it is clear that they do not differ substantially in complexity of meaning from what Gary Larson attributed to dog barks.
One last comment. I believe that some of the apparent
conflict in positions regarding the degree of continuity of human language with
vocal learning in non-human animals stems from the fact that Chomsky and
colleagues often endorse the neurobiological model of syntax of Friederici (2012, 2017), in which the core of human language syntax is localized to the posterior portion of Broca’s area. If the core of human language is
localized to Broca’s area, I can very well see why Erich and colleagues would
point out that this core is not specific to humans and very likely draws on
very similar computational systems that are found in other organisms and
domains such as vocal learning. As I have said, I agree 100% on this point (see
Matchin, 2017). I think particular subregions of Broca’s area are involved in working memory and sequencing operations that are particularly important for language production and heavily conserved across species. I (and many others) have
pointed out that the Friederici model of syntax in Broca’s area is incorrect
(Matchin et al., 2014; Matchin, 2017). I am currently working on a paper with
Greg Hickok that proposes a theory in which the core of combinatorial syntax
and semantics is localized to the middle-posterior superior temporal sulcus.
When that paper comes out, I would be very interested in hearing Erich’s thoughts and having another productive debate and/or discussion.
References
Berwick, R. C., & Chomsky,
N. (2015). Why only us: Language and evolution. MIT press.
Berwick, R. C., Beckers, G. J., Okanoya, K., &amp; Bolhuis, J. J. (2012). A bird’s eye view of human language evolution. Frontiers in Evolutionary Neuroscience, 4.
Blank, I., Balewski, Z.,
Mahowald, K., & Fedorenko, E. (2016). Syntactic processing is distributed
across the language system. Neuroimage, 127, 307-323.
Bolhuis, J. J., Tattersall, I., Chomsky, N., &amp; Berwick, R. C. (2014). How could language have evolved? PLoS Biology, 12(8), e1001934.
Emmorey, K., Mehta, S., &
Grabowski, T. J. (2007). The neural correlates of sign versus word
production. Neuroimage, 36(1), 202-208.
Feenders, G., Liedvogel, M.,
Rivas, M., Zapka, M., Horita, H., Hara, E., ... & Jarvis, E. D. (2008).
Molecular mapping of movement-associated areas in the avian brain: a motor
theory for vocal learning origin. PLoS One, 3(3),
e1768.
Fridriksson, J., Yourganov,
G., Bonilha, L., Basilakos, A., Den Ouden, D. B., & Rorden, C. (2016).
Revealing the dual streams of speech processing. Proceedings of the
National Academy of Sciences, 201614038.
Hauser, M. D., Chomsky, N., &amp; Fitch, W. T. (2002). The faculty of language: what is it, who has it, and how did it evolve? Science, 298(5598), 1569-1579.
Hickok, G., & Poeppel, D.
(2007). The cortical organization of speech processing. Nature Reviews
Neuroscience, 8(5), 393-402.
Jarvis, E. D. (2004). Learned
birdsong and the neurobiology of human language. Annals of the New York
Academy of Sciences, 1016(1), 749-777.
Klima, E., & Bellugi, U.
(1979). The signs of language. Harvard University Press.
Leonard, M. K., Ramirez, N.
F., Torres, C., Travis, K. E., Hatrak, M., Mayberry, R. I., & Halgren, E.
(2012). Signed words in the congenitally deaf evoke typical late lexicosemantic
responses with no early visual responses in left superior temporal
cortex. Journal of Neuroscience, 32(28), 9700-9705.
Leroy, F., Cai, Q., Bogart, S.
L., Dubois, J., Coulon, O., Monzalvo, K., ... & Lin, C. P. (2015). New
human-specific brain landmark: the depth asymmetry of superior temporal
sulcus. Proceedings of the National Academy of Sciences, 112(4),
1208-1213.
MacSweeney, M., Woll, B.,
Campbell, R., McGuire, P. K., David, A. S., Williams, S. C., ... & Brammer,
M. J. (2002). Neural systems underlying British Sign Language and audio‐visual
English processing in native users. Brain, 125(7),
1583-1593.
Matchin, W., Sprouse, J., &amp; Hickok, G. (2014). A structural distance effect for backward anaphora in Broca’s area: An fMRI study. Brain and Language, 138, 1-11.
Matchin, W. G. (2017). A neuronal retuning hypothesis of sentence-specificity in Broca’s area. Psychonomic Bulletin &amp; Review, 1-13.
Matchin, W., Villwock, A., Roth, A., Ilkbasaran, D., Hatrak, M., Davenport, T., Halgren, E., &amp; Mayberry, R. (2017). The cortical organization of syntactic processing in American Sign Language: Evidence from a parametric manipulation of constituent structure in fMRI and MEG. Poster presented at the 9th annual meeting of the Society for the Neurobiology of Language.
Mesulam, M. M., Rogalski, E.
J., Wieneke, C., Hurley, R. S., Geula, C., Bigio, E. H., ... & Weintraub,
S. (2014). Primary progressive aphasia and the evolving neurology of the
language network. Nature Reviews Neurology, 10(10),
554-569.
Petitto, L. A. (1994). Are signed languages ‘real’ languages? Evidence from American Sign Language and Langue des Signes Québécoise. Reprinted from: Signpost (International Quarterly of the Sign Linguistics Association), 7(3), 1-10.
Sandler, W., &
Lillo-Martin, D. (2006). Sign language and linguistic universals.
Cambridge University Press.