More from William Matchin -- Reply to Erich Jarvis:
At the most recent SfN, Erich Jarvis gave the opening presidential address on the functional neuroanatomy of language, which I commented on and critiqued in my recent blog post for Talking Brains (http://www.talkingbrains.org/2017/11/abstractness-innateness-and-modality.html). Erich has briefly responded to my writing on Twitter and suggested a debate. Few things could give me more pleasure than a productive debate on central issues concerning the nature of human language. The following is a response to his comments in the context of a more in-depth exploration of the issues under discussion regarding the phenotype of language (both cognitive/behavioral and neurological) and its evolution. In general, I believe that we have far more points of agreement than disagreement, although I believe there remain fundamental divides, not least of which is the nature of sign language and its connection to spoken language, which I believe reveals the essence of language itself. Erich’s comments, and his quotation of my words, are in bold.
EJ: Dear William. I did some holiday reading of your #SfN17 blog. You said you like a debate. In this series of tweets I challenge a number of your claims as misguided in a manner I have commonly seen in linguistics. But I support your effort in summarizing the major language talks. Some of your comments about my #SfN17 presidential talk indicate that you did not correctly hear what I said, and thus made some incorrect statements about what I said. I will correct them.
While I consider myself a linguist, my degrees and primary training are in cognitive psychology and neuroscience. I don’t think our disagreements stem from the distinction between the fields of linguistics and neuroscience, but rather from how to characterize the human language phenotype in comparison to the cognitive systems found in other animals.
WM: central question Erich Jarvis addressed...was whether human language is an elaborate form of vocal learning seen in other animals or ...a horse of a different color
EJ: No, I said vocal learning is 1 component of spoken-language, & yes more elaborate in humans.
WM: Jarvis is an expert of the biology of birdsong, & he argued that human language is continuous with vocal learning in non-human organisms...
EJ: I argued that vocal learning is itself a continuous trait, that contributes to spoken language. I didn't arguing they're the same
Both of us agree that language has multiple interacting components, many of which are shared with other animals. I think Erich’s work on this topic is extremely helpful for understanding the function of certain neural language circuits in humans (more on this below). Our agreement goes further than this in that we agree that language-specific biological components of language are minimal. In this respect, the perspectives of Chomsky and colleagues (e.g. Hauser, Chomsky & Fitch, 2002; Berwick & Chomsky, 2015; Bolhuis et al., 2014), myself, and Erich and his colleagues are fundamentally aligned. The disagreement concerns which capacities are language-specific (if any) and the impact that these components have on the behavioral and cognitive lives of humans.
It seems from Erich’s presentation and his published work that he asserts that vocal learning is the central component of human language. Consider this first line of the abstract of Erich’s 2004 review paper: “Vocal learning, the substrate for human language, is a rare trait…” [emphasis mine]. I take the phrase “the substrate for human language” to mean that it is a critical component of human language, a sine qua non. If Erich does not endorse this position, then we are closer to agreement – clarification on this point would help greatly. This is especially important for considering sign language and its relation to spoken language.
WM: don't think Jarvis mentioned sign language once during entire talk (except non-human gestures)
EJ: I believe language-like perception & production circuits (including sign) exist before speech. Speech circuits inherited their functions & all became more advanced in humans.
WM: All of these observations tell us that there is nothing important about language that must be expressed in the auditory-vocal modality.
EJ: Agree with “must”, but auditory-vocal modality is dominant. Its hard to read or think w/o silently talking to & hearing yourself.
My understanding of Erich’s main theoretical position, the motor theory of vocal learning origin (Feenders et al., 2008), is that many animals have cortical-subcortical motor circuits that allow for precise control of peripheral appendages (e.g., hands, claws, paws, wings) that were then duplicated and adapted for use in control over the vocal tract. This effect occurred independently in vocal learning animals yet relies on a common genetic substrate, and underlies the ability of humans to learn and produce complex speech sequences.
I strongly endorse this. One feature I appreciate about this proposal is that it focuses on a very specific neuroanatomical-functional circuit (with accompanying genetic underpinnings). This view suggests that speech may be special in some ways, but it clearly has its basis in pre-existing neural circuitry found both in other cognitive domains (such as motor control of the arms and legs) and in other animals (such as vocal learning birds). It allows for tractable comparative behavioral and neuroscience research that may prove useful for understanding the human capacity for language. As I mentioned above, this approach is aligned with the minimalist approach of Chomsky and colleagues that seeks to eliminate as much domain-specific machinery from theories of language as possible. We’re all on the same page here.
In fact, I go one step further than Erich. In a recent paper (Matchin, 2017) I laid out the evidence for the hypothesis that language-specific portions of anterior Broca’s area acquire their proclivity for higher-level aspects of language through a process of neuronal retuning, in which pre-existing computational circuits for speech production are harnessed for more abstract language functions (through either genetic exaptation or developmental neuronal recycling). On the point that language makes use of pre-existing computational machinery, I think that Erich and I agree heartily.
If so, where do Erich and I disagree? For one, we disagree about the nature of sign language and its relation to spoken language. Erich appears to posit that sign language and spoken language inhabit similar yet distinct circuits in the brain, and does not seem to endorse the view that sign and speech share the same core linguistic computations that are absent in non-human organisms (on this latter point it is difficult to make out Erich’s view). Erich’s papers and talks only discuss vocal learning and the classic speech circuits for production in posterior Broca’s area and perception in superior temporal gyrus, which may be specialized for auditory-vocal language. Yet his work ignores the well-supported advances made in neuroimaging and aphasia research in the last several decades regarding the localization of central aspects of language to association cortex outside of these speech regions (see Hickok & Poeppel, 2007; Mesulam et al., 2014; Fridriksson et al., 2016; and Blank et al., 2016 for reviews).
Humans are the only organisms I am aware of that can communicate equally well in either the auditory-vocal or visual-manual modality. Much converging data from psycholinguistic experiments, linguistic analyses, developmental studies, and neuroscience indicate that sign and spoken language share many core properties that appear to be central to the human language phenotype, many of which are qualitatively distinct from those of other animals (see e.g. Klima & Bellugi, 1979; Petitto, 1994; Sandler & Lillo-Martin, 2006; MacSweeney et al., 2002; Emmorey et al., 2007; Leonard et al., 2012; and Matchin et al., 2017). While it may be the case that spoken language is the default form of communication, as I pointed out in my original blog post, one can easily imagine an alternate history of our world in which the dominant languages are sign languages, with obscure spoken languages used by blind communities.
By contrast, I posit a mixed view: sign and speech share brain circuits for lexical access, syntax, and semantics, while systems of perception and production may inhabit distinct cortical locations. Consider the figure below for an example regarding my view about language comprehension circuits in the brain. The yellow areas represent secondary visual cortical regions involved in the perception of sign, which are distinct from the blue areas representing secondary auditory cortex involved in the perception of speech. Both systems converge on a common underlying language system that is neutral with respect to sensory modality, involved in the perception of words, syntactic combination, and the interpretation of meaning. This includes human-unique anatomical asymmetry between the left and right superior temporal sulcus (Leroy et al. 2015), smack in the middle of the red areas indicated on the brain figure. Given the standard assumption of left hemisphere language dominance, this suggests that humans possess unique brain organization in the superior temporal sulcus underlying their capacity for language.
I think that when seriously examining the computational properties of sign language and its neurobiology, it is difficult not to conclude that there is a central, human-unique component of language that is modality-independent. Naturally, this raises the question of what exactly this component consists of, which I turn to next.
WM: 3 main challenges to a continuity hypothesis were entirely omitted or extravagantly minimized: syntax, semantics, & sign language
EJ: I mentioned syntax (e.g. Foxp2 impaired mice Chabout 2015) & semantics (e.g. usage vocal learning) both in support of continuum hypothesis
WM: 3 main challenges to a continuity hypothesis were entirely omitted or extravagantly minimized: syntax, semantics, & sign language
EJ: In order for something to be continuous, you need to have a more advanced form and a more 'extravagantly' minimized form. That makes sense.
Erich has misunderstood me here; my phrase “extravagantly minimized” refers to his minimal review of the phenotype of human language. It is only possible to claim that syntax and semantics are in large part conserved between non-human organisms and humans if the details of their key properties have not been discussed. These properties have been supported by decades of neuroscience, psychology, and linguistics research. Here I will simply list some of them, for which there is no evidence in bird vocalizations, as reviewed in Berwick et al. (2012), a paper that I strongly recommend:
· Unbounded non-adjacent dependencies
· Phrases “labeled” by element features
· Asymmetrical hierarchical phrases
· Hierarchical self-embedding of phrases of the same type
· Hierarchical embedding of phrases of different types
· Phonologically null chunks
· Displacement of phrases
· Duality of phrase interpretation
· Crossed-serial dependencies
· Productive link to “concepts”
While all of these are important, I will focus on the last: human language syntax combines meaning-bearing elements to produce new sentences with novel meanings. This is an essential component of language. For instance, I can combine (in very particular ways) the phrase “the alien” with the verb “serenaded” and the phrase “the giant squid”, and the result is a sentence that has (hopefully) never been perceived before. This does not happen in birdsong, where individual meaningless elements combine to produce meaningless expressions that serve a monolithic function of territory defense or mate attraction.
Erich suggested that this sort of combinatorics does occur to some extent in certain vocal learning birds. As evidence, he played a video of a parakeet producing the following: “What seems to be the problem officer? I am not a crook. My name is Disco – I’m a parakeet.” In a different talk, Erich stated the following: “Disco learned up to 400 words in four years and he could recombine words into new sentences, many times they don’t have meaning to the listening people but other times they do, like you hear here. That’s quite remarkable.”
I agree that it’s remarkable. However, when watching the full video of Disco, one notices that this bird starts to sound a bit … well, not human-like. Here’s more from Disco (https://www.youtube.com/watch?v=EFJeY9fL5tk):
“Baby disco. Give me a kiss [kiss sound]. Gonna get that belly. Gonna get that belly. Gonna get that belly. Gonna get that belly. Gonna get that belly. Disco, meet the Disco, he’s a dabba-dabba do-bird. [unintelligible]. Bird bird. [beat-boxing]. I’m a parakeet – bird to your mother. What seems to be the profficer. I am not a crook – my name is Disco. I’m a parakeet. What seems to be the problem officer. I am not a crook – my name is Disco. I’m a parakeet. [beat-boxing]. I’m a parakeet – bird to your mother. Nobody … [distracted]. Disco, meet the Disco, he’s a dabba-dabba do-bird. [unintelligible]. Bird bird. Disco budgie in the house tonight. Eat some millet and have a good tide [sic]. Domo arigato Mr. Roboto. [chirp] Oh there goes Tokyo. Go go Godzilla. Uh shadoobay shattered, shattered. Shadoobay shattered, shattered. Domo arigato Mr. Roboto. Disco budgie in the house tonight. Eat some millet and have a tide. Domo arigato Mr. Roboto. Domo. Domo arigato Mr. … [distracted]. Domo arigato Mr. … [distracted]. Beat box budgie. Domo arigato Mr. Roboto. Domo. Domo arigato Mr. … [distracted]. Don’t just stand there, bust a move. [unintelligible] and prosper. [unintelligible] and prosperd. [unintelligible]. Where’s the beef. Shadoobay shattered, shattered. Shattered. Domo arigato Mr. Roboto. I’m Disco and I know it. I’m Disco and I know it. What did Momma say. Nobody puts baby bird in corner. Never shake a baby bird. Never shake a baby bird. Nobody puts baby bird in corner. Never shake a baby bird. What seems to be the problem officer? I am not a crook – my name is Disco. I’m a parakeet. Mean poopachine. What did Momma say. There goes Tokyo. Go go Godzilla. [chirping] Disco. There goes Tokyo. Go Godzilla. There goes Tokyo. Go go Godzilla. There goes Tokyo. Go gox budgie. Baby got bax budgie. [unintelligible] that belly. Gotta get that belly. Gotta get that belly. What you talking about Disco. Ooh la la, give me a big kiss. Ooh la la, give me a big kiss. 
Ooh la la, give me a big kiss. [kiss sound].”
Disco is not talking in the way that a healthy human does (in fact, he sounds a bit like a patient with fluent aphasia). He is spitting back out the patterns of speech that he has perceived, with minimal recombination of underlying meaningful elements. This is not a particularly novel observation about parakeets. As phonologically complex as bird vocalizations are compared to the much simpler barks of dogs, it is clear that their complexity of meaning does not differ substantially from that which Gary Larson attributed to dog barks.
One last comment. I believe that some of the apparent conflict in positions regarding the degree of continuity of human language with vocal learning in non-human animals stems from the fact that Chomsky and colleagues often endorse the neurobiological model of syntax of Friederici (2012; 2017), in which the core of human language syntax is localized to the posterior portion of Broca’s area. If the core of human language is localized to Broca’s area, I can very well see why Erich and colleagues would point out that this core is not specific to humans and very likely draws on computational systems very similar to those found in other organisms and domains such as vocal learning. As I have said, I agree 100% on this point (see Matchin, 2017). I think particular subregions of Broca’s area are involved in working memory and sequencing operations particularly important for language production that are heavily conserved across species. I (and many others) have pointed out that the Friederici model of syntax in Broca’s area is incorrect (Matchin et al., 2014; Matchin, 2017). I am currently working on a paper with Greg Hickok that proposes a theory in which the core of combinatorial syntax and semantics is localized to the middle-posterior superior temporal sulcus. When that paper comes out, I would be very interested in hearing Erich’s thoughts and having another productive debate and/or discussion.
Follow me on twitter: https://twitter.com/wmatchin, or check out my website: https://www.williammatchin.com/.
Berwick, R. C., & Chomsky, N. (2015). Why only us: Language and evolution. MIT press.
Berwick, R. C., Beckers, G. J., Okanoya, K., & Bolhuis, J. J. (2012). A bird’s eye view of human language evolution. Frontiers in evolutionary neuroscience, 4.
Blank, I., Balewski, Z., Mahowald, K., & Fedorenko, E. (2016). Syntactic processing is distributed across the language system. Neuroimage, 127, 307-323.
Bolhuis, J. J., Tattersall, I., Chomsky, N., & Berwick, R. C. (2014). How could language have evolved?. PLoS biology, 12(8), e1001934.
Emmorey, K., Mehta, S., & Grabowski, T. J. (2007). The neural correlates of sign versus word production. Neuroimage, 36(1), 202-208.
Feenders, G., Liedvogel, M., Rivas, M., Zapka, M., Horita, H., Hara, E., ... & Jarvis, E. D. (2008). Molecular mapping of movement-associated areas in the avian brain: a motor theory for vocal learning origin. PLoS One, 3(3), e1768.
Fridriksson, J., Yourganov, G., Bonilha, L., Basilakos, A., Den Ouden, D. B., & Rorden, C. (2016). Revealing the dual streams of speech processing. Proceedings of the National Academy of Sciences, 201614038.
Hauser, M. D., Chomsky, N., & Fitch, W. T. (2002). The faculty of language: what is it, who has it, and how did it evolve?. Science, 298(5598), 1569-1579.
Hickok, G., & Poeppel, D. (2007). The cortical organization of speech processing. Nature Reviews Neuroscience, 8(5), 393-402.
Jarvis, E. D. (2004). Learned birdsong and the neurobiology of human language. Annals of the New York Academy of Sciences, 1016(1), 749-777.
Klima, E., & Bellugi, U. (1979). The signs of language. Harvard University Press.
Leonard, M. K., Ramirez, N. F., Torres, C., Travis, K. E., Hatrak, M., Mayberry, R. I., & Halgren, E. (2012). Signed words in the congenitally deaf evoke typical late lexicosemantic responses with no early visual responses in left superior temporal cortex. Journal of Neuroscience, 32(28), 9700-9705.
Leroy, F., Cai, Q., Bogart, S. L., Dubois, J., Coulon, O., Monzalvo, K., ... & Lin, C. P. (2015). New human-specific brain landmark: the depth asymmetry of superior temporal sulcus. Proceedings of the National Academy of Sciences, 112(4), 1208-1213.
MacSweeney, M., Woll, B., Campbell, R., McGuire, P. K., David, A. S., Williams, S. C., ... & Brammer, M. J. (2002). Neural systems underlying British Sign Language and audio‐visual English processing in native users. Brain, 125(7), 1583-1593.
Matchin, W., Sprouse, J., & Hickok, G. (2014). A structural distance effect for backward anaphora in Broca’s area: An fMRI study. Brain and language, 138, 1-11.
Matchin, W. G. (2017). A neuronal retuning hypothesis of sentence-specificity in Broca’s area. Psychonomic bulletin & review, 1-13.
Matchin, W., Villwock, A., Roth, A., Ilkbasaran, D., Hatrak, M., Davenport, T., Halgren, E. & Mayberry, M. (2017). The cortical organization of syntactic processing in American Sign Language: Evidence from a parametric manipulation of constituent structure in fMRI and MEG. Poster presented at the 9th annual meeting of the Society for the Neurobiology of Language.
Mesulam, M. M., Rogalski, E. J., Wieneke, C., Hurley, R. S., Geula, C., Bigio, E. H., ... & Weintraub, S. (2014). Primary progressive aphasia and the evolving neurology of the language network. Nature Reviews Neurology, 10(10), 554-569.
Petitto, L. A. (1994). Are signed languages ‘real’ languages? Evidence from American Sign Language and Langue des Signes Québécoise. Reprinted from: Signpost (International Quarterly of the Sign Linguistics Association), 7(3), 1-10.
Sandler, W., & Lillo-Martin, D. (2006). Sign language and linguistic universals. Cambridge University Press.