There are two new papers on the neurophysiological correlates of speech and language processing that are quite interesting. They are closely related to each other and are fun to read (and discuss) as a pair. Both compare the responses to intelligible versus unintelligible speech using neuronal oscillations as the metric. One group focuses on the gamma band, one on the theta band. Both papers do a terrific job motivating the study, and both show some nice analyses.
One paper is by Marcela Peña and Lucia Melloni and just appeared in the Journal of Cognitive Neuroscience: Brain Oscillations during Spoken Sentence Processing, May 2012, Vol. 24, No. 5, pp. 1149-1164.
Marcela and Lucia used high-density EEG and employed a cross-linguistic design. They recorded from Spanish and Italian participants while they listened to Spanish, Italian, or Japanese. The study derives from the perspective of 'binding by synchrony,' a position that continues to receive a lot of attention in systems and cognitive neuroscience - but is not yet as widely investigated in speech/language studies. The assumption is that when listening to a language the listener understands (i.e. there is comprehension at the sublexical, lexical, syntactic, and semantic levels), whatever neural signal reflects 'binding' across the populations that need to be coordinated will be enhanced in the intelligible conditions (i.e. Spanish for Spanish speakers, Italian for Italian speakers). What they observe is that gamma-band power is selectively enhanced during the sentence when it is comprehended. (Their figures 1 and 5 tell the whole story.) They conclude that the low-frequency theta activity tracks lower-level information, while the (lower) gamma band reflects what happens in intelligible speech, i.e. binding of higher-level representations. Overall, this supports a binding-by-synchrony style view for language processing.
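Just to make the logic concrete, here is a rough sketch of the kind of band-power comparison that sits behind a result like this. It is not Marcela and Lucia's actual pipeline (they used proper time-frequency decompositions of high-density EEG); the function names, the 25-40 Hz "lower gamma" range, and the toy surrogate data are all my choices for illustration.

```python
# Rough sketch (not the authors' pipeline): band-limited power per trial,
# compared across intelligible vs. unintelligible sentence conditions.
# Assumes epoched EEG as arrays of shape (n_trials, n_samples) at rate fs.
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def band_power(epochs, fs, band=(25.0, 40.0)):  # lower-gamma range; my choice, not necessarily theirs
    """Mean band-limited power per trial via bandpass filter + Hilbert envelope."""
    b, a = butter(4, [band[0] / (fs / 2), band[1] / (fs / 2)], btype="bandpass")
    filtered = filtfilt(b, a, epochs, axis=-1)
    power = np.abs(hilbert(filtered, axis=-1)) ** 2
    return power.mean(axis=-1)  # one value per trial

# Toy usage with surrogate data standing in for the two conditions:
fs = 250
rng = np.random.default_rng(0)
intelligible = rng.standard_normal((50, 5 * fs))    # e.g. Spanish for Spanish listeners
unintelligible = rng.standard_normal((50, 5 * fs))  # e.g. Japanese for Spanish listeners
print(band_power(intelligible, fs).mean(), band_power(unintelligible, fs).mean())
```

With real data, the prediction is simply that the first mean comes out reliably higher than the second once the sentence is being understood.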
And a slightly different perspective/conclusion ...
The other paper is by Jonathan Peelle, Joachim Gross, and Matt Davis and is in Cerebral Cortex: Phase-Locked Responses to Speech in Human Auditory Cortex are Enhanced During Comprehension, doi: 10.1093/cercor/bhs118.
Jonathan, Joachim, and Matt used MEG and presented listeners with vocoded speech that was either intelligible (16 channels), partially intelligible (4 channels), or unintelligible (1 channel). They also presented a 4-channel unintelligible condition (spectrally rotated). They calculate a quantity they call 'cerebro-acoustic coherence', which quantifies the relation between the envelopes of the stimuli and the low-frequency (4-7 Hz) neural response. They show that when a sentence is intelligible, the coherence is systematically higher. (Their figures 1 and 4 pretty much tell the story.) Of special interest is their observation of an MTG-centered, left-lateralized activation when comparing 4-channel intelligible versus unintelligible stimuli. This adds further support to the key role MTG plays for (lexically mediated) intelligibility. Moreover, their data challenge what some of my collaborators and I have argued, namely that theta tracking is acoustic (e.g. Howard & Poeppel 2010, 2012).
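For readers who want a feel for the measure, here is a stripped-down sketch of a cerebro-acoustic-coherence-style computation. It is not Jonathan, Joachim, and Matt's MEG pipeline (they work with source-localized, properly epoched data); it just shows the core idea: coherence between the speech amplitude envelope and the neural signal, averaged over 4-7 Hz. The function names and the surrogate signals are mine.

```python
# Stripped-down sketch (not the authors' method): coherence between the
# speech envelope and a neural time series, averaged over the theta band.
# Assumes both signals share the same sample rate fs (real audio would be
# downsampled to the MEG rate first).
import numpy as np
from scipy.signal import hilbert, butter, filtfilt, coherence

def speech_envelope(audio, fs, cutoff=10.0):
    """Broadband amplitude envelope, low-pass filtered to keep slow modulations."""
    env = np.abs(hilbert(audio))
    b, a = butter(4, cutoff / (fs / 2), btype="low")
    return filtfilt(b, a, env)

def cerebro_acoustic_coherence(neural, audio, fs, band=(4.0, 7.0)):
    """Mean magnitude-squared coherence between envelope and neural signal in `band`."""
    env = speech_envelope(audio, fs)
    f, cxy = coherence(env, neural, fs=fs, nperseg=int(2 * fs))  # 2 s windows
    mask = (f >= band[0]) & (f <= band[1])
    return cxy[mask].mean()

# Toy usage: noise with a 5 Hz envelope modulation and a neural signal that
# tracks it imperfectly -- theta-band coherence should come out elevated.
fs, dur = 200, 20
t = np.arange(0, dur, 1 / fs)
audio = np.random.randn(t.size) * (1 + 0.5 * np.sin(2 * np.pi * 5 * t))
neural = np.sin(2 * np.pi * 5 * t) + np.random.randn(t.size)
print(cerebro_acoustic_coherence(neural, audio, fs))
```

The empirical claim is then that this number is systematically larger for intelligible sentences than for acoustically matched but unintelligible ones.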
A little whining, some small regrets ... There are three things I would like to hear about from Marcela and Lucia. (i) Why not analyze the low-frequency response components in more detail? (ii) Why focus solely on power and not look at phase? (iii) Why does the gamma-band response in the intelligible conditions not start until 1000 ms after sentence onset? Presumably the first second of a sentence is also understood ... And from Jonathan, Joachim, and Matt, I would have liked to know: (i) Why no analyses of the higher frequencies, e.g. the low gamma band? (ii) Why no analyses of power? (iii) Why are the behavioral data for 4 channels (fig. 1E) so different from the rest of the literature using such materials (Shannon, Drullman, etc.)?
Notwithstanding a little complaining, these are very cool papers! So, if we could have these two articles date and generate a paper-offspring, a baby paper, I could imagine seeing some interesting alignments between theta and gamma that reflect intelligibility. Maybe we need both regimes of neuronal oscillations to generate usable representations ...