Comments on "Comprehension, intelligibility, neural oscillations: two interesting new papers" (Talking Brains)

Jonathan Peelle (May 23, 2012, 3:06 PM) said...

hi David,

All of your comments are very well taken! Some quick responses:

1) Regarding the frequencies at which we looked for cerebro-acoustic coherence: our goal was to look specifically at theta oscillations associated with the dominant acoustic components of the speech signal, which lined up well with the largest overall phase-locking we saw in the MEG data (our Figure 2). Because we didn't see any hints of significant phase locking at higher frequencies, we didn't explore those in detail, but I agree it is a good idea.

2) Power analyses to complement our phase analyses have been on our list for quite some time now, but did not make it into the paper. These seemed less urgent to me because of the nice work from other groups (cough Luo & Poeppel 2007 cough) that was fairly convincing in showing the importance of phase (not power) in the responses we were looking at, along with some of the really nice work in nonhuman primates (Schroeder, Lakatos, et al.). Nevertheless, I completely agree that these analyses are sensible and would be helpful.

3) In fact, I don't think the behavioral data for the 4-channel vocoded condition are odd at all. As you know, the intelligibility of vocoded speech depends on numerous factors, including the number of channels, their spacing, the frequency range, SNR, amount of exposure, etc., not to mention various digital signal processing minutiae (envelope filter frequency, parameters of the filters, and so on). In our hands we found ~30% correct word report for 4-channel vocoded sentences. This is not all that different from the little under 20% correct in Davis & Johnsrude (2003). Shannon et al. (1995) show much higher performance (their Fig. 2), but their upper frequency was 4 kHz (as opposed to our 8 kHz), meaning that their channels carried significantly more spectral detail below 4 kHz. (Not to mention the large amount of training their listeners received.) I could go on, but I think we're actually not all that different from the rest of the literature. (A toy sketch of the vocoding signal chain appears after the references below.)

For what it's worth, I'm pretty sure our paper is single and open to dating. I agree that multiple frequencies of oscillations getting together seems like a good idea.

References:

Davis MH, Johnsrude IS (2003) Hierarchical processing in spoken language comprehension. J Neurosci 23:3423-3431.

Shannon RV, Zeng F-G, Kamath V, Wygonski J, Ekelid M (1995) Speech recognition with primarily temporal cues. Science 270:303-304.
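Since point (3) turns on vocoding parameters, a concrete signal chain may help. Below is a minimal Python sketch of noise vocoding, assuming NumPy and SciPy; the channel count, logarithmic spacing, frequency range, and envelope cutoff are exactly the knobs Jonathan lists, but the default values here are illustrative guesses, not the parameters from his paper.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def noise_vocode(x, fs, n_channels=4, f_lo=70.0, f_hi=8000.0, env_cutoff=30.0):
    """Noise-vocode x: keep each band's amplitude envelope, discard its
    fine structure by re-imposing that envelope on band-limited noise.
    All defaults are illustrative, not any paper's actual settings.
    Requires fs > 2 * f_hi (e.g. 22.05 kHz audio for f_hi = 8 kHz)."""
    edges = np.logspace(np.log10(f_lo), np.log10(f_hi), n_channels + 1)
    noise = np.random.randn(len(x))
    env_sos = butter(4, env_cutoff, btype="low", fs=fs, output="sos")
    out = np.zeros(len(x))
    for lo, hi in zip(edges[:-1], edges[1:]):
        band_sos = butter(4, [lo, hi], btype="band", fs=fs, output="sos")
        band = sosfiltfilt(band_sos, x)           # analysis band
        env = np.abs(hilbert(band))               # amplitude envelope
        env = sosfiltfilt(env_sos, env)           # smooth ("envelope filter")
        env = np.clip(env, 0.0, None)             # filtering can undershoot 0
        carrier = sosfiltfilt(band_sos, noise)    # noise limited to same band
        out += env * carrier                      # modulate and sum channels
    return out / (np.max(np.abs(out)) + 1e-12)   # normalize peak level
```

The sketch makes Jonathan's 4 kHz versus 8 kHz comparison concrete: with n_channels=4, raising f_hi stretches the same four log-spaced bands over twice the range, so the speech-dominant region below 4 kHz is covered more coarsely and intelligibility can drop even though the channel count is unchanged.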
Greg Hickok (May 23, 2012, 3:22 PM) said...

Thanks for the summary of these papers, David. Now I don't have to read them!

You say "binding-by-synchrony". Do you really think synchrony is causing the binding? Or is it a consequence of the binding? Does it even matter?

I normally just think of oscillation synchrony as a reflection of the fact that networks are talking to each other, just as, say, a negative deflection is an ERP reflection of the activation of auditory cortex by an acoustic event. We don't say "auditory activation by negative deflection". Is there more to synchrony? What's the evidence?

David Poeppel (May 24, 2012, 10:59 AM) said...

Greg: your question is fair (if not new), but I am not the right person to answer it. Actually, Lucia Melloni, a coauthor of one of the papers, has thought and written about this a great deal, and maybe we could persuade her to summarize some of the main issues for us. (Lucia: what do you say?)

There are a bunch of interesting papers and reviews, and of these I am especially partial to the work of Pascal Fries.

If you *really* want to dig into this stuff, a good starting point is a special issue of Neuron from 1999 called, creatively, The Binding Problem.

I think you are right to wonder (I certainly wonder about the logic of the issue), but the neural responses we can measure in the context of these studies (often oscillations) are certainly useful, regardless of their ultimate interpretation.
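David's closing point, that the oscillatory responses are useful measures whatever their causal status, connects back to Jonathan's phase-versus-power remarks: both treat phase locking as something you can quantify without committing to a mechanism. As a concrete illustration, here is a toy Python sketch of a phase-locking value between a neural time series and a stimulus envelope in a theta band. The band edges, filter order, and synthetic test signals are assumptions for illustration only, not anyone's published pipeline.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def theta_plv(neural, envelope, fs, band=(4.0, 8.0)):
    """Phase-locking value between a neural signal and a speech envelope,
    restricted to a theta band: a toy stand-in for cerebro-acoustic
    coherence. Band edges and filter order are illustrative assumptions."""
    sos = butter(4, band, btype="band", fs=fs, output="sos")
    ph_n = np.angle(hilbert(sosfiltfilt(sos, neural)))
    ph_e = np.angle(hilbert(sosfiltfilt(sos, envelope)))
    # Length of the mean phase-difference vector: 1 means the two phases
    # hold a fixed relationship, values near 0 mean they drift freely.
    # Amplitude (power) never enters the measure; this is phase only.
    return np.abs(np.mean(np.exp(1j * (ph_n - ph_e))))

# Toy check: an oscillation locked to the envelope scores high,
# unrelated noise scores much lower.
fs = 200.0
t = np.arange(0, 10, 1 / fs)
envelope = 1 + np.cos(2 * np.pi * 5 * t)            # 5 Hz "speech" rhythm
locked = np.cos(2 * np.pi * 5 * t + 0.7) + 0.5 * np.random.randn(len(t))
unlocked = np.random.randn(len(t))
print(theta_plv(locked, envelope, fs))    # high, close to 1
print(theta_plv(unlocked, envelope, fs))  # much lower
```

The measure is agnostic in exactly the sense Greg raises: a high value says the phases travel together, not whether the synchrony does the binding or merely reflects it.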