Monday, November 28, 2016

Revisiting the relation between speech production and speech perception: Further comments on Skipper et al.

Continuing the "discussion" of Skipper, Devlin, and Lametti's (SDL) recent and in my opinion badly misguided review of the relation between speech perception and production, let's consider this quote on page 84:
Miceli, Gainotti, Caltagirone, and Masullo (1980) found a strong relationship between the ability to produce speech and discriminate syllables in 69 fluent and nonfluent aphasics. Specifically, contrasts between groups with and without a phonemic output disorder showed that patients with a disorder were worse at discriminating phonemes, particularly but not limited to those distinguished by place of articulation
This is misleading. "Ability to produce speech" in this paper is defined basically as the presence of phonemic paraphasias in the absence of articulatory difficulty, which will tend to identify fluent aphasics with posterior lesions, such as those with Wernicke's and conduction aphasia.  This is a rather odd measure of "ability to produce speech," but nonetheless the article reports that patients with a "phonemic output disorder" (POD+) so defined were compared with those without one (POD-) on a syllable discrimination task, and the POD+ group performed worse, which is what SDL note and call a "strong relationship." However, when Miceli et al. dug deeper to ask whether there was a correlation between the severity of the output disorder and the severity of the syllable discrimination deficit, no relation was observed.  More importantly, Miceli et al. go on to report dissociations between POD+ status and comprehension measures, which is the point I've been making for quite a while.

Thus, rather than providing evidence for a relation between speech output ability and the ability to perceive speech, the report shows (i) that the severity of the production deficit is not correlated with the severity of performance on a syllable discrimination task and (ii) that the presence of a production deficit (POD+) dissociates from measures of auditory comprehension.

SDL also claim in the same section that "Both children and adults with cerebral palsy have been shown to perform worse on phoneme discrimination and this is often related to articulatory abilities," citing Bishop et al. 1990 in support of their claims.  This is one of my favorite studies because it clearly shows how incredibly important task selection is to understanding speech perception.  It is true that people with cerebral palsy performed worse on syllable discrimination tasks, but the same participants showed NO IMPAIRMENT relative to controls when the same speech sounds were comprehended (using a very cool task) rather than discriminated.  See my blog post about the Bishop et al. study here.

SDL also use Parkinson's disease--"a degenerative movement disorder that results in reductions in premotor, SMA, and parietal cortex metabolism, linked to the basal ganglia"--as evidence that motor impairment affects speech perception.  I've addressed these findings previously in The Myth of Mirror Neurons, but I noticed a new paper in the citation list by Vitale et al., so I looked it up and noted a fascinating conclusion from this large-scale (N > 100) study.  Here's an extended quote from the abstract:
Our patients with Parkinson's disease showed age-dependent peripheral, unilateral, or bilateral hearing impairment. Whether these auditory deficits are intrinsic to Parkinson's disease or secondary to a more complex impaired processing of sensorial inputs occurring over the course of illness remains to be determined. Because α-synuclein is located predominately in the efferent neuronal system within the inner ear, it could affect susceptibility to noise-induced hearing loss or presbycusis. It is feasible that the natural aging process combined with neurodegenerative changes intrinsic to Parkinson's disease might interfere with cochlear transduction mechanisms, thus anticipating presbycusis
So people with Parkinson's disease have peripheral hearing loss.  It seems to me that this might be a better explanation of the speech perception deficit than the damage to the motor system that SDL invoke.

Friday, November 18, 2016

How do chinchillas, pigeons, and infants perceive speech? Another Comment on Skipper et al.

There's a nagging problem for any theory that holds that the motor system is critical for speech perception: critters without the biological capacity for speech production can be trained to perceive speech remarkably well.  Here's the graph from Kuhl & Miller showing categorical perception in chinchillas:

This, I would say, was a major factor in the demise of the motor theory of speech perception, and it is why speech scientists like me had abandoned the idea in any strong form by the time mirror neurons came on the scene.
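(For readers who want a concrete sense of what it means for the human and chinchilla identification curves to be "similar": the standard move in this literature is to fit an identification function to the proportion of one category of responses along the acoustic continuum and compare where the estimated phonetic boundaries fall. Here is a minimal Python sketch of that idea; the numbers are made up for illustration and are not Kuhl & Miller's actual data.)

import numpy as np
from scipy.optimize import curve_fit

def logistic(vot, boundary, slope):
    # Proportion of /t/ ("voiceless") responses as a function of voice onset time (ms)
    return 1.0 / (1.0 + np.exp(-slope * (vot - boundary)))

# Hypothetical labeling data along a /da/-/ta/ VOT continuum (illustrative only)
vot = np.array([0., 10., 20., 30., 40., 50., 60., 70., 80.])
human = np.array([0.02, 0.03, 0.05, 0.20, 0.75, 0.95, 0.98, 0.99, 1.00])
chinchilla = np.array([0.03, 0.05, 0.10, 0.30, 0.70, 0.90, 0.97, 0.98, 0.99])

for label, responses in [("human", human), ("chinchilla", chinchilla)]:
    (boundary, slope), _ = curve_fit(logistic, vot, responses, p0=[35.0, 0.2])
    print(f"{label}: estimated /d/-/t/ boundary ~ {boundary:.1f} ms VOT (slope {slope:.2f})")

# If the two estimated boundaries (and slopes) come out close, the identification
# curves are "similar" in the sense at issue, regardless of whether the listener
# has any motor plan for the sounds.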

A reasonable response to data such as these is to acknowledge that speech perception can happen with the auditory system alone. With that as our limiting case, if you want to explore the role of the motor system in speech perception, it will have to be a much more nuanced contribution, e.g., that the motor system somehow contributes a little bit under some circumstances.  I've acknowledged this possibility. From Hickok et al. 2009:
the claim for the ‘necessity’ of the motor system in speech perception seems to boil down to 10 percentage points worth of performance on the ability to discriminate or judge identity of acoustically degraded, out of context, meaningless syllables – tasks that are not used in typical speech processing and that double-dissociate from more ecologically valid measures of auditory comprehension even when contextual cues have been controlled. This suggests a very minor modulatory role indeed for the motor system in speech perception. 
Ok, so that's a little snarky for an acknowledgement.  Here's another that's more measured from Hickok et al. 2011:
we propose that sensorimotor integration exists to support speech production, that is, the capacity to learn how to articulate the sounds of one’s language, keep motor control processes tuned, and support online error detection and correction. This is achieved, we suggest, via a state feedback control mechanism. Once in place, the computational properties of the system afford the ability to modulate perceptual processes somewhat, and it is this aspect of the system that recent studies of motor involvement in perception have tapped into.

I'm not sure I agree with myself anymore, as the evidence for a modulatory role under ecologically valid listening conditions is extremely weak.  For example, Pulvermuller and colleagues took the task-selection complaints seriously and performed a TMS study using comprehension as their measure. This study failed to replicate the effect on the accuracy of speech perception found with discrimination or identification tasks, but it did find an RT effect that held for some sounds and not others.  See my detailed comments on this study here.

But back to SDL and chinchillas.  What is their take on these facts? Here's what they say:
Though these categorical speech perception studies are often revered because they suggest the reality of speech units like phonemes, they have been criticized. Problems include that the tasks assume the units under study and that within category differences are actually readily discernible and meaningful
I agree both that these studies don't necessarily imply that the phoneme (or segment) is a unit of analysis in perception and that listeners can in fact hear within-category differences (see Massaro's critiques of categorical perception).  But that doesn't make the similarity between the human and chinchilla curves evaporate.  No matter what unit is being analyzed, and whether or not within-category differences can be detected under other task conditions, it still remains that chinchillas can hear subtle differences between speech sounds. SDL's critique is tangential.

SDL then turn to a line of argumentation that makes no sense to me. They write of the claim from animal work that
Neurobiologically, the argument is unsound because, in the early work frequently used to support this argument... the brain was not directly observed. 
 The claim is not neurobiological.  It is functional.  Neurobiology doesn't matter for the structure of the argument: if an animal cannot produce speech yet can perceive it, it follows that you don't need to be able to produce speech to perceive it.  Period.  But let's read on:
Yet it has been suggested that premotor cortex is involved in processing sounds that we cannot produce in ways that make use of the underlying computational mechanisms that would also be involved in movement
So this implies that motor plans for non-speech actions are sufficient for perceiving speech.  So, assuming that SDL buy into the broader claim that action understanding for, say, grasping and speech is achieved via motor simulation, what they are actually saying is that when a chinchilla perceives a speech sound it resonates with a motor network for some non-speech action (biting?) and this somehow results in the correct perception of the speech sound (for which there is no motor plan) rather than of the motor plan that it actually resonated with. Hmm. Isn't it a bit more parsimonious to assume that the two sounds are acoustically different and that the chinchilla's auditory system can detect and represent that difference?

If we are going to accept a hypothesis that deviates substantially from parsimony, we're going to need some very strong evidence.  SDL highlight the fact that premotor areas of nonhuman primates activate during the perception of sounds they cannot produce. But again there is a more parsimonious explanation. The brain needs to map all sorts of perceptual events onto action plans for responding to those events. If you see a snake coiling and rattling its tail, you need to map that percept onto movement plans for getting out of the way. Presumably, your premotor cortex would be activated by that percept even though you have no motor programs for coiling and tail rattling. The same mechanism can explain the data SDL mention.

SDL also highlight that "Bruderer et al. (2015) showed that perturbing the articulators of 6-month old infants disrupts their ability to perceive speech sounds."  But the study is confounded by differences in the amount of distraction that the methods of perturbation likely cause.  Here are the teethers they used.  Which one would you guess is more annoying to the infant? Once you've made your guess, go read the paper and see which one caused the speech perception decline.


In sum, SDL make a convoluted argument to salvage the idea that the motor system is responsible for the perception of speech even in animals and prelingual infants.  A much simpler explanation exists: auditory speech perception is achieved by the auditory system, which is present in adult humans, prelingual infants, and chinchillas, all of which can perceive speech surprisingly well.

Wednesday, November 16, 2016

On the hearing ear and speaking tongue: Comments on Skipper, Devlin & Lametti - 1

Skipper, Devlin, and Lametti (SDL; 2017, Brain & Language, 164:77-105) review the evidence regarding the role of the "motor system" in speech perception and conclude that it is ubiquitously involved.  More specifically, they conclude:
Results are inconsistent with motor and acoustic only models of speech perception and classical and contemporary dual-stream models of the organization of language and the brain. Instead, results are more consistent with complex network models in which multiple speech production related networks and subnetworks dynamically self-organize to constrain interpretation of indeterminant acoustic patterns as listening context requires. [from abstract]
 I disagree.  What I'd like to do here is spark a discussion of this paper, hopefully involving input from the authors, that highlights the points of disagreement.  This will probably involve several posts. I'm going to start here with SDL's section on "Rethinking the question."

SDL wish to deconstruct the question of the motor system's role in speech perception, which is a laudable goal. In doing so they argue,
...that the question and indeed the entire debate is misleading due to the complexity of the neurobiology of speech production and the dynamic nature of speech perception.
 As the statement indicates, their argument comes in two parts:

  1. The network involved in speech production is complex and not, for example, restricted to Broca's area. In other words, identifying what counts as "the motor system" is hard.
  2. Speech perception is context dependent and we don't even know what the unit of analysis is; i.e., it is dynamic. In other words, identifying what counts as "speech perception" is hard (and shifting). 
I'm sympathetic to these characterizations, but the fact that it is a complicated system doesn't mean you can't ask a precise question or that a precise question hasn't been asked (if you look closely).  Take the first argument.  SDL complain that "Very often the motor system is discussed only in reference to Broca's area."  True, some studies, including at least one of my own, focus on Broca's area.  In my case, I was specifically addressing claims about Broca's area doing something for speech perception, so it made sense. But not all studies do, and my claims about the "motor system" have been more general, often referring to the entire dorsal stream, as SDL's quote from Hickok & Poeppel demonstrates. Moreover, one doesn't have to understand the neuroanatomy of the motor system to study its role in perception; you can do it functionally by identifying instances of functional motor speech disruption and looking at their effects on perception.  This was my approach in a Wada deactivation study and in a host of other studies, some of which are reviewed here.

Regarding the second argument, SDL write,
what is meant by ‘‘speech perception” is typically ill defined. It is often discussed in the neurobiological literature as if it is a static operation, the result of which are minimal categorical units of speech analysis, phonemes or syllables, from which we can then build words and put those words into sentences. This assumption is reflected in the way speech perception and the brain is studied using primarily isolated speech sounds like ‘‘da” and ‘‘ba”
I agree that many researchers study "speech perception" using primarily isolated speech sounds like "da" and "ba," and I agree that this is an impediment to the field.  In fact, SDL's complaint sounds extremely familiar to me.  Here is a quote from Hickok & Poeppel 2000:
Part of this confusion stems from differences in what one means by ‘speech perception’ and how one tests it behaviorally. Psychological research on speech perception typically utilizes tasks that involve the identification and/or discrimination of ‘sub-lexical’ segments of speech, such as meaningless syllables, and many neuropsychological and functional imaging studies have borrowed from this rich literature
Another from Hickok & Poeppel 2004:
The upshot is that the particular task which is employed to investigate the neural organization of language (that is, the mapping operation the subject is asked to compute) determines which neural circuit is predominantly activated. [emphasis original to the published paper, cuz it's THAT important and people tend to miss the point]
And again from Hickok & Poeppel 2007:
Many studies using the term ‘speech perception’ to describe the process of interest employ sublexical speech tasks, such as syllable discrimination, to probe that process. In fact, speech perception is sometimes interpreted as referring to the perception of speech at the sublexical level. However, the ultimate goal of these studies is presumably to understand the neural processes supporting the ability to process speech sounds under ecologically valid conditions, that is, situations in which successful speech sound processing ultimately leads to contact with the mental lexicon and auditory comprehension
We have been harping on this point for 16 years and have repeatedly argued that the functional anatomy of "speech perception" varies by task: if you look at ecologically valid tasks (speech perception in the wild), you see a ventral stream, temporal lobe basis; if you look at typical laboratory "sub-lexical" tasks, you see a dorsal, frontoparietal basis.  This is why in Hickok & Poeppel 2007 we stated clear definitions of the speech terms we used:
In this article we use the term ‘speech processing’ to refer to any task involving aurally presented speech. We will use speech perception to refer to sublexical tasks (such as syllable discrimination), and speech recognition to refer to the set of computations that transform acoustic signals into a representation that makes contact with the mental lexicon.
It is odd that SDL complain about a lack of terminological clarity in the field (#BeenThereSaidThat), call for the abandonment of the model that was developed precisely to remedy the problem they complain about, and then fail to adhere to their own advice to worry about what counts as "speech perception" (they go on to cite a wide range of tasks, mostly "sublexical," to support their claims).  In fact, according to SDL's imprecise definition of speech perception (any task counts), Hickok & Poeppel have already made their argument for the role of the motor system in speech perception.  E.g., from the abstract of Hickok & Poeppel 2000: "Tasks that require explicit access to speech segments rely on auditory–motor interface systems in the left frontal and parietal lobes."  So, yes, if you allow syllable discrimination to count as "speech perception," the motor system is definitely involved.  Here is a series of posts I wrote in 2007 on this topic: here, here, here, and here.

SDL have not redefined the problem; they have rediscovered a known problem, one that has been at least partially solved, and then ignored the solution.


Assistant or Associate Professor Department of Communication Sciences and Disorders The Pennsylvania State University

We are a nationally-ranked program that, for more than 80 years, has been
preparing highly-qualified professionals for clinical, academic, and
research careers addressing prevention and rehabilitation of speech,
language, and hearing problems (http://csd.hhd.psu.edu).  Our tenured,
tenure-track faculty have active programs of research within a culture that
encourages engagement with students in our undergraduate, master’s, and
doctoral programs.  The responsibilities of the position are:  establish or
continue a line of research in one or more of the following specialty
areas - neurofunction of speech production/disorders, neurofunction of
swallowing disorders, disorders of articulation/phonology, child apraxia of
speech, fluency disorders, voice disorders, language development in
individuals with hearing impairment or cochlear implants, aural
(re)habilitation, or auditory processing development/disorders; teach
undergraduate and graduate courses consistent with the candidate’s
expertise; supervise undergraduate and graduate (M.S./Ph.D.) research;
provide service to the Department, College, and University; and contribute
to the clinical aspects of the program.  Opportunities exist to make use of
the magnetic resonance imaging facility and for interdisciplinary research
collaboration with colleagues from multiple Centers, Consortia and Labs at
the University Park Campus and at Penn State’s College of Medicine at
Hershey.  Numerous departments including Biobehavioral Health, Psychology,
Kinesiology, Bioengineering, and Human Development and Family Studies are
potential partners.  Requires a Ph.D. in a relevant area with an emerging
and/or active thematic program of research and scholarship that has the
potential to attract external funding.  Previous teaching experience and/or
post-doctoral experience desired. CCC-SLP or CCC-A preferred.  The ability
to work effectively with diverse students, faculty, and staff is required.
Salary will be competitive, commensurate with background and experience.  An
attractive benefits package is available. Review of applications will begin
January 1, 2017 and continue until a suitable candidate is identified.
Candidates will need to upload a resume/CV, copies of transcripts,
statements of teaching and research experiences and interests, up to three
relevant publications, and a list of references.  Position will begin Fall
2017 or as negotiated.  Apply online at https://psu.jobs/job/67719

CAMPUS SECURITY CRIME STATISTICS: For more about safety at Penn State, and
to review the Annual Security Report which contains information about crime
statistics and other safety and security matters, please go to
http://www.police.psu.edu/clery/, which will also provide you with detail on
how to request a hard copy of the Annual Security Report.

Penn State is an equal opportunity, affirmative action employer, and is
committed to providing employment opportunities to all qualified applicants
without regard to race, color, religion, age, sex, sexual orientation,
gender identity, national origin, disability or protected veteran status.

Wednesday, November 9, 2016

Postdoctoral research position in Vocal Learning in London!

Vocal Communication Laboratory, Royal Holloway, University of London
ESRC-funded postdoctoral research position in Vocal Learning
Application deadline: 11th November 2016

Applications are invited for the post of Research Associate to work with Dr Carolyn McGettigan on the project “Vocal Learning in Adulthood: Investigating the mechanisms of vocal imitation and the effects of training and expertise”, which is funded by the Economic and Social Research Council. The project will investigate the behavioural and neural correlates of the imitation of vocal sounds in control and expert (e.g. beatboxers, singers) populations, using magnetic resonance imaging (MRI) of the brain and the vocal tract.

Applicants should hold a PhD in Psychology, Neuroscience or a related discipline (e.g. Experimental Phonetics, Speech Science, Medical Physics). They must have previous research experience with neuroimaging using MRI and show a capacity to use computational methods for cognitive neuroscience research. Expertise in auditory processing and vocal communication research is highly desirable.

This is a full time post, available from January 2017 or as soon as possible thereafter for a fixed term period of 12 months in the first instance.

This post is based in Egham, Surrey where the College is situated in a beautiful, leafy campus near to Windsor Great Park and within commuting distance from London.

For an informal discussion about the post, please contact Dr Carolyn McGettigan (Carolyn.McGettigan@rhul.ac.uk or +44 (0)1784 443529). For more information about the activities of the Royal Holloway Vocal Communication Laboratory, visit the lab website: www.carolynmcgettigan.com.

To view further details of this post and to apply please visit https://jobs.royalholloway.ac.uk. Interested applicants should complete the online application form and submit (i) a full curriculum vitae with a list of publications and (ii) a 1-page statement of past and current research activities and areas of interest. The Human Resources Department can be contacted with queries by email at: recruitment@rhul.ac.uk.

Please quote the reference: 0916-321


Interview Date: To Be Confirmed

Wednesday, November 2, 2016

Two ECoG Post Doc positions UT Houston (Tandon Lab) & with an international team including Crone, Hickok, Dehaene

Two postdoctoral research positions are available at the NeuroImaging and Electrophysiology Lab (Tandon Lab) in the Department of Neurosurgery at the University of Texas Medical School in Houston. These positions are funded by a recently awarded BRAIN Initiative U01 grant for which Dr. Tandon is the PI. The project uses electro-corticographic (ECoG) recordings in a large cohort (n=80) to evaluate psycho-linguistic models of reading and speech production, with the goal of creating a network-level representation of language. Collaborators on the project with whom the post-docs will work closely are Nathan Crone (Hopkins), Greg Hickok (UCI), Stanislas Dehaene (College de France), Xaq Pitkow (Baylor) and Josh Breier (UT Houston).

Project Description: 
This is a close multi-center collaboration that brings together investigators with established track records in intracranial EEG (iEEG) recordings, the neuroscience of language, and computational neuroscience to better understand the uniquely human behaviors of reading and producing language. More details about the U01 grant are online at NIH Reporter. The post-docs will benefit from close interaction with several experts in the fields of reading, semantics, and speech production.

Post-doc Responsibilities:
The selected individuals are expected to be highly motivated team players who have a passion for studying cognitive processes using direct recordings in humans.  They will be responsible for 1) optimizing and refining paradigms for use in the project, 2) data collection in the epilepsy monitoring unit and in the MRI scanner, 3) ECoG data analysis using analysis pipelines existing in the lab and via the development of innovative strategies, and 4) data presentation at conferences and manuscript and grant writing.


Requirements:
The selected individuals must have a Ph.D. in one or more of the following: neuroscience, psychology, cognitive science, mathematics, electrical engineering, or computer science. Previous experience in neural time series data analysis or in functional imaging studies of reading or speech production is highly desirable. The ability to code independently in one or more of the following is crucial: MATLAB, R, or Python. Given the multiple unpredictable variables and privacy issues around data collection in human patients, the individual must possess high ethical and professional standards, be able to adapt to a changing environment, reorganize schedules dynamically, and work with tight deadlines. The individual must be able to work independently, yet collaborate effectively on projects with multiple investigators. A strong publication record and excellent prior academic credentials are highly desirable.