Sunday, March 30, 2014

A double dose of ECoG – two 2014 papers on speech


A recent paper in Nature and a recent paper in Science provide ECoG evidence for dorsal stream function and STG function, respectively.

The first paper, “Sensory–motor transformations for speech occur bilaterally,” is from my NYU colleague Bijan Pesaran’s lab; the first author is Greg Cogan, a post-doc with Bijan. The paper tackles the important question of how dorsal stream structures implement sensory–motor transformations, an issue that Greg Hickok and I have speculated about (and Greg H. has worked on extensively). This rich paper reports a bunch of cool findings worth reading and studying. One of the strong claims – the one that gives the paper its title – concerns the bilateral nature of (those parts of) the dorsal stream underpinning sensory–motor transformations for speech. Previous work has argued that output-related dorsal-stream processing is lateralized, certainly much more strongly than ventral-stream areas/functions. I still find that position on the right track (cf. Hickok & Poeppel 2007), and I derive some special frisson from the fact that Greg Cogan, the co-architect of this counter-argument, was my graduate student and is an important collaborator. The data are the data – so it’s now important to figure out the why/how/what/when of these two dorsal streams. I am no apologist for lateralization in speech, but these data certainly present a new interpretive challenge. Speculations, ideas, data welcome.


The second paper, “Phonetic Feature Encoding in Human Superior Temporal Gyrus,” is from Eddie Chang’s lab at UCSF and is spearheaded by Nima Mesgarani (now faculty at Columbia University in the EE department). Over the years, the evidence has steadily accumulated that STG is the ‘home’ of acoustic-phonetic perceptual analysis. Previous ECoG data, including stimulation data, for example by Dana Boatman, Nathan Crone, and colleagues, have provided strong evidence for STG (for review, see Boatman 2004, http://www.ncbi.nlm.nih.gov/pubmed/15037126). This new work builds on those findings and demonstrates the sensitivity and selectivity of this region. From data acquired while the patients listened to spoken sentences (from numerous speakers), Nima et al. extracted each electrode’s activity profile for all English phonemes. Phonetic features turn out to be an effective grouping principle (manner of articulation is especially prominent). Nima had done a similar project for his dissertation work in Shihab Shamma’s lab (I harassed him about it at his defense …) – but ferrets neither speak nor listen to all that much human speech … In this new work, the acoustic-phonetic encoding is elegantly described, providing some ways to think about the intermediate representations that could link input-related spectro-temporal processing to linguistic structures.
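To make the grouping logic concrete, here is a minimal Python sketch of the kind of analysis involved – not the authors’ code, to be clear: the electrode count, manner labels, and simulated responses below are invented stand-ins. The idea is to build a phoneme-by-electrode matrix of mean responses and hierarchically cluster the phonemes; if features like manner organize STG responses, phonemes should group by manner class rather than arbitrarily.

```python
# Illustrative sketch only: synthetic stand-in for phoneme-locked ECoG data.
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)

# Coarse phoneme -> manner-class map (hypothetical subset, for illustration).
manner = {
    "p": "stop", "t": "stop", "k": "stop", "b": "stop", "d": "stop",
    "s": "fricative", "f": "fricative", "z": "fricative", "v": "fricative",
    "m": "nasal", "n": "nasal",
    "a": "vowel", "i": "vowel", "u": "vowel",
}
phonemes = list(manner)
classes = sorted(set(manner.values()))
n_electrodes = 40  # invented; real grids differ

# Synthetic "mean high-gamma response" per phoneme: each manner class
# recruits a partly distinct electrode profile, plus noise. In a real
# analysis, each row would be the average response per electrode across
# all tokens of that phoneme in the recorded sentences.
class_profile = {c: rng.normal(0.0, 1.0, n_electrodes) for c in classes}
responses = np.array([
    class_profile[manner[p]] + 0.4 * rng.normal(0.0, 1.0, n_electrodes)
    for p in phonemes
])

# Hierarchically cluster phonemes by the similarity of their electrode
# response profiles, then inspect the leaf order for feature structure.
Z = linkage(pdist(responses, metric="correlation"), method="average")
leaves = dendrogram(Z, labels=phonemes, no_plot=True)["ivl"]
print("phoneme order after clustering:", leaves)
print("manner classes in that order:  ", [manner[p] for p in leaves])
```

Run as written, the phonemes come out grouped by manner class, which is the toy version of the paper’s point: the feature structure falls out of the response similarities rather than being imposed.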
