Thursday, January 21, 2010

Disentangling syntax and intelligibility -- Or how to disprove two theories with one experiment

I both love and hate a recent paper by Angela Friederici, Sonja Kotz, Sophie Scott, & Jonas Obleser titled Disentangling syntax and intelligibility in auditory language comprehension. The paper is in the "Early View" section of Human Brain Mapping.

Here's why I love it. There are a number of claims in the literature on the neuroscience of language that I disagree with. One is Sophie Scott's claim that speech recognition is a left hemisphere function that primarily involves anterior temporal regions. Another is Angela Friederici's claim that a portion of Broca's area, BA44, is critical for "hierarchical structure processing". In the study reported in this new paper, Friederici and Scott have teamed up and proven both of these claims to be incorrect. This I like.

What I hate about the paper is that the authors don't seem to recognize that their new data provide strong evidence against their previous claims; in fact, they argue that the data support their views.

So what did they do? The experiment is a nice combination of the intelligibility studies that Scott has published and the syntactic processing studies that come out of Friederici's lab. It was a 2x2 design: grammaticality (grammatical vs. ungrammatical sentences) crossed with intelligibility (normal vs. spectrally rotated, and therefore unintelligible, speech).
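
For concreteness, here is a minimal sketch of the factorial structure and the two contrasts discussed below (the condition labels and contrast weights are my own shorthand, not the authors' notation):

from itertools import product

# The four cells of the 2x2 design (labels are my shorthand)
grammaticality = ["grammatical", "ungrammatical"]
intelligibility = ["intelligible", "rotated"]  # spectral rotation renders speech unintelligible
conditions = list(product(grammaticality, intelligibility))

# The two main-effect contrasts, expressed as +1/-1 weights over the four cells
intelligibility_contrast = {c: (+1 if c[1] == "intelligible" else -1) for c in conditions}
syntax_contrast = {c: (+1 if c[0] == "ungrammatical" else -1) for c in conditions}

for c in conditions:
    print(c, "intelligibility weight:", intelligibility_contrast[c], "syntax weight:", syntax_contrast[c])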

What did they find? The intelligible minus unintelligible contrast showed bilateral activation up and down the length of the STG/STS, i.e., not just in the left hemisphere and not just anterior to Heschl's gyrus. This contradicts previous studies from Scott's group, particularly with respect to the right hemisphere activation, as the current paper correctly pointed out:

...the right-hemispheric activation in response to increasingly intelligible speech deviates from the original papers on intelligibility [Narain et al., 2003; Scott et al., 2000]. (p. 6)


In short, the primary bit of data that has been driving claims for a left anterior pathway for intelligible speech has been shown to be inaccurate. This is not terribly surprising, as those previous studies were severely underpowered.

Conclusion #1: the "pathway for intelligible speech" is bilateral and involves both anterior and more posterior portions of the STS/STG.

What about Broca's area and hierarchical structure building? In fairness, most of the paper was about the STG/STS and not about Broca's area, but the role of Broca's area was addressed and of course it is perfectly fair to use data from this study to address a hypothesis proposed by Friederici in other papers. If Broca's area is involved in hierarchical structure building, then it should activate during the comprehension of sentences, which surely are hierarchically structured. Thus, the intelligible (structured) minus unintelligible (unstructured) contrast should result in activation of Broca's area. Yet it did not. The contrast between intelligible and unintelligible sentences resulted only in activation in the superior temporal lobes.

Conclusion #2: Hierarchical structure building can be achieved without Broca's area involvement.

So in light of these findings, how does one maintain the view that intelligible speech primarily involves the left hemisphere and that syntactic (hierarchical) processing involves Broca's area? It all hinges on the response to those pesky ungrammatical sentences.

Here's the assumption on which their argument relies: syntactic processing is really only revealed during the processing of ungrammatical sentences. They don't state it in these terms, but this is what you have to assume for their arguments to work. Right off the bat we have a problem with this assumption. When you listen to an ungrammatical sentence, not only does this mess up syntactic processing, but it also increases the load on semantic integrative processes and who knows what other meta-cognitive processes are invoked by hearing a sentence like, "The pizza was in the eaten", which is an example of the kind of violation they used. In fact, one might even argue that processing an ungrammatical sentence causes the syntactic processing mechanism to shut down and instead crank up cognitive interpretation strategies. Thus rather than highlighting syntax, such a manipulation may highlight non-syntactic comprehension strategies!

So what happens when you listen to ungrammatical sentences and spectrally rotated ungrammatical sentences?

Ungrammatical sentences minus grammatical sentences (intelligible only) resulted in activation in the left and right superior temporal lobes, Broca's area (left BA 44), and the left thalamus. So the "syntactic" effect is bilateral in the superior temporal lobe, but at least we now have Broca's area active.

The authors then took the seven ROIs defined by the two main contrasts (intelligible minus unintelligible and ungrammatical minus grammatical), extracted percent signal change around the peaks, and performed subsequent ANOVAs to assess interactions. These interactions are what really drive their argument. However, we now have another problem, namely that the data that defined the ROIs are not independent of the data that were subsequently analyzed using ANOVAs. We therefore can't be sure the reported effects are valid. Nonetheless, let's pretend they are and see if the conclusions make sense.
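
To see why this non-independence matters, here is a toy simulation (my own illustration under deliberately simplified assumptions, not the authors' actual pipeline). If you pick voxels because they show an effect in noisy data and then test that same effect on the same data, you can get "significant" results out of pure noise:

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_subjects, n_voxels = 16, 5000

# Pure noise: by construction, no voxel has any true effect
data = rng.normal(size=(n_subjects, n_voxels))

# Step 1: "define the ROI" -- pick the voxels with the largest group effect
t_all, _ = stats.ttest_1samp(data, 0.0)
roi = np.argsort(t_all)[-10:]  # top 10 voxels, selected on this very data

# Step 2 (double-dipping): test the same effect, in the same data, in that ROI
t_biased, p_biased = stats.ttest_1samp(data[:, roi].mean(axis=1), 0.0)

# Proper version: select on one half of the subjects, test on the other half
half1, half2 = data[:n_subjects // 2], data[n_subjects // 2:]
t_half1, _ = stats.ttest_1samp(half1, 0.0)
roi_indep = np.argsort(t_half1)[-10:]
t_valid, p_valid = stats.ttest_1samp(half2[:, roi_indep].mean(axis=1), 0.0)

print(f"double-dipped: t = {t_biased:.2f}, p = {p_biased:.4f}")  # spuriously "significant"
print(f"independent:   t = {t_valid:.2f}, p = {p_valid:.4f}")    # no selection bias here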

Here is a graph of the interactions:

The claim here is that "syntax" (i.e., greater response to ungrammatical) and intelligibility (i.e., greater response to intelligible) significantly interacted only in the left hemisphere ROIs, and indeed in all of them, including BA 44 and the thalamus. Therefore, according to Friederici et al., these regions represent the critical network, because they are responding to the syntactic features in intelligible speech and not merely to acoustic differences, which are present in the unintelligible speech as well. Something is very wrong with this logic, even beyond the possibly invalid assumption and analysis methods noted above.

Consider the response pattern in BA 44. Zero response to normal syntactically structured sentences (comprehension of which presumably requires some degree of syntactic processing), significant activation to intelligible ungrammatical sentences, significant (or so it seems) activation to UNINTELLIGIBLE versions of grammatical sentences, and no activation to unintelligible versions of ungrammatical sentences. What possible syntactic computation could be invoked BOTH by a grammatical violation and by unintelligible noises, but not by grammatical sentences? And this pattern is considered part of the intelligible speech/syntactic processing system, whereas the right anterior STS, which shows a very robust intelligibility effect and no obvious effect of violation, is not. I would suggest instead that because the right STS is actually responding to sentences, and not just to broken sentences or spectrotemporal noise patterns, it is the region more likely to be involved in sentence processing.
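
To make the logical problem concrete, here is a toy calculation with invented cell means that merely mimic the qualitative pattern just described (the numbers are mine for illustration, not values from the paper):

# Hypothetical percent-signal-change cell means (invented for illustration only)
cells = {
    "BA44":      {("gram", "intell"): 0.0, ("ungram", "intell"): 0.3,
                  ("gram", "rot"):    0.3, ("ungram", "rot"):    0.0},
    "right STS": {("gram", "intell"): 0.5, ("ungram", "intell"): 0.5,
                  ("gram", "rot"):    0.0, ("ungram", "rot"):    0.0},
}

for region, m in cells.items():
    # 2x2 interaction: the (ungram - gram) difference under intelligible speech
    # minus the same difference under rotated speech
    interaction = (m[("ungram", "intell")] - m[("gram", "intell")]) - (
        m[("ungram", "rot")] - m[("gram", "rot")]
    )
    # Main effect of intelligibility: mean(intelligible) - mean(rotated)
    intell_effect = (m[("gram", "intell")] + m[("ungram", "intell")]) / 2 - (
        m[("gram", "rot")] + m[("ungram", "rot")]
    ) / 2
    print(f"{region}: interaction = {interaction:+.2f}, intelligibility = {intell_effect:+.2f}")

# BA44: large interaction (+0.60) despite ZERO response to normal sentences and
# zero intelligibility effect; right STS: large intelligibility effect (+0.50)
# and no interaction. The interaction test alone cannot tell you which region
# is actually doing sentence processing.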

In the end, Friederici et al.'s entire argument rests on (i) a possibly invalid assumption about their "syntactic" manipulation, (ii) a possibly contaminated statistical analysis, and (iii) a logically questionable definition of what counts as a region involved in the processing of these language stimuli.

The basic findings are extremely important, though, because they confirm that speech recognition, and now the "pathway for intelligible speech," is bilateral, and that Broca's area is silent during normal sentence comprehension and is therefore not involved in basic syntactic/hierarchical structure building.

References


Friederici, A. D., Kotz, S. A., Scott, S. K., & Obleser, J. (2009). Disentangling syntax and intelligibility in auditory language comprehension. Human Brain Mapping. PMID: 19718654

Narain, C., Scott, S. K., Wise, R. J., Rosen, S., Leff, A., Iversen, S. D., & Matthews, P. M. (2003). Defining a left-lateralized response specific to intelligible speech using fMRI. Cerebral Cortex, 13(12), 1362-1368.

Scott, S. K., Blank, C. C., Rosen, S., & Wise, R. J. S. (2000). Identification of a pathway for intelligible speech in the left temporal lobe. Brain, 123, 2400-2406.

Scott, S. K., & Wise, R. J. (2004). The functional neuroanatomy of prelexical processing in speech perception. Cognition, 92(1-2), 13-45.

6 comments:

Jonas said...

Greg, let me keep this as concise as possible. First, thank you for your coverage of this recent paper I had the honour to be senior author on. There is no such thing as bad publicity. On a more serious note, I cannot help but comment on a few things.

In by now classic TB style, you make a few pretty polemic claims about this particular study. I hope I am – the usual vanities involved when it comes to one's own paper aside – well placed to comment on this, as I had neither strong feelings on BA 44's job nor on right-hemisphere involvement before we started this, and I thought it would be a very straightforward experiment. It was decidedly not about whether or not the "in the eaten" is the best grammatical manipulation conceivable, or whether rotated speech is the best acoustic baseline (it is surely better than what you call "noises"). It was about whether these previously widely used manipulations could tell us anything about the relative involvement of the anterior STG and anterior STS in processing such (say) acoustic versus (say) structural manipulations.

I do maintain that it worked: it gave us the best within-individual evidence so far on such a possible functional delineation (I am surprised that you did not comment on, or buy into, this main aspect of our data), and a lot more extra on the side. You have chosen to mainly single out results (and draw extended conclusions) on BA 44.

All this should best be resolved by great new experiments we should all go and run and submit to peer review. I will not further spam the TB comment form now.

Most importantly, though, I am somewhat upset by your only half-heartedly disguised claim that our data analysis, and hence our results, were invalid. Yes, the ROI ANOVAs were grounded on the same individuals and trials as the group analysis; however, we never claimed them to be independent. This is a difference. They show nothing that you could not in principle conclude from the SPM group analyses, and the main results (left-hemispheric dominance for syntactic processes; STG peaks for syntax violation, while STS peaks for intelligibility) are all contained in the SPM group data also. The ROI ANOVAs and bar graphs are just the other side of the same coin, and they help to illustrate and clarify things; not more, but also not less. Had I refrained from including the ANOVA interaction test, I probably would have to argue with you over bar heights now.

tom said...

Hi Greg,

I haven’t anything yet to say about the syntactic element of this study, but I agree with the idea of left-lateralised processing for intelligible speech, and this paper doesn't change my mind. I’ll explain why below.

I think everyone would agree that listening to ‘speechy’ stuff activates the temporal lobes bilaterally. I also don’t think many people would have a problem with the idea that stimuli that are attended to are likely to be, in general, more activating. In order to interpret any sort of intelligible > unintelligible contrast, therefore, it’s crucial that the experimental design ensures that intelligible and unintelligible conditions are attended to equally. This is difficult as, under normal conditions, an intelligible sentence will engage a listener for longer than a sequence of meaningless noise of equal length. This study makes no attempt to ensure the unintelligible condition is attended to. It’s no surprise, therefore, that the subtraction intelligible > unintelligible shows greater activation bilaterally. What we are seeing here is the entire network for ‘speechy’ processing; i.e. acoustic/phonetic/phonological/semantic/speaker identity etc. etc. etc., much of which (though not necessarily all) will be bilaterally organised. This isn’t a true ‘intelligibility’ contrast, and may just be showing the effects of attentional modulation upon the auditory/speech system.

The Scott and Narain studies you mention that do show a left hemisphere preference for intelligible > spectrally rotated speech differ from the current study in two crucial ways. (1) In these studies, the in-scanner task was not completely passive. Subjects were asked to 'try to understand' the stimuli. This wouldn't ordinarily be a particularly good way to ensure equal attention to both conditions, as subjects would be likely to 'switch off' after the first few seconds of spectrally-rotated speech when it becomes obvious that the speech cannot be understood. However, in both of these studies, but not the current study, a 'vocoded' degraded speech condition was also present, and (2) crucially, subjects were trained to understand vocoded speech before they were scanned. My guess is that this training caused the subjects to subsequently attend more to both the vocoded degraded speech and the 'unintelligible' rotated speech, as they were now more likely to try harder to extract meaning from odd 'speechy' auditory stimuli. The intelligible and unintelligible conditions are therefore slightly better matched for attention, and the left-hemisphere activation predominates here.

What happens when the intelligible and unintelligible conditions are attended to equally? Alex Leff and I published a study a year or so ago where we tried to make this happen (a discussion can be found elsewhere on your blog). We used short speech stimuli (two-word idioms of ~2 sec length) and used time-reversed speech as a control. In the scanner, subjects were asked to decide the gender of the speaker for each phrase. The accuracy rates for the in-scanner task showed that both conditions were attended to equally. Result: the intelligibility contrast was very strongly left-lateralised. We had 26 subjects, so power isn't an issue. If processing intelligible speech engages a bilateral system we should have seen some right-hemisphere stuff, but we didn't.

Greg Hickok said...

Hi Jonas,

Thanks for your comment. As you have gathered, I have no particular issue with the goals of the experiment itself, and I certainly believe, as I noted, that it provides important information. What I had a problem with, and what is apparently quite orthogonal to your own predispositions, was the interpretation of the findings relative to the existing proposals by Scott and Friederici. So this is what I focused on in my TB "review".

Regarding the grammatical versus ungrammatical manipulation let me say that I've never been a big fan. The reason is that I don't know (and I don't think anyone knows) what the response to ungrammatical sentences actually reflects. (One might get closer to understanding in the EEG domain, but with fMRI we don't have the time resolution.) These manipulations may not be tapping syntax, in which case, the logic of the entire interpretation is undermined. In this sense, I'm not sure what we have learned from the interaction of intelligibility and "syntax". Do you disagree with this? Do you really think the response to ungrammatical sentences is a pure measure of some syntactic computation? Or would you admit that it may be semantic, for example, or some other cognitive response to an anomaly?

Regarding the independence issue, I don't mean to upset anyone, but frankly, running statistical analyses on data that was used to select the data that one is analyzing is statistical double-dipping and therefore biased -- it should not have been published and I would urge you to re-do the analysis and publish an erratum so we can know whether the finding is real or not. It may be that had you used independent data things would have come out the same. But we simply don't know.

Greg Hickok said...

Hi Tom,

We have a forthcoming article in Cerebral Cortex in which we completely replicated the Scott/Narain experiments with noise vocoded speech, pre-training and all (but with more subjects) and found robust bilateral activity.

So let me ask you this: If the speech "intelligibility" system is so strongly left dominant, why doesn't destruction of the left temporal lobe cause word deafness?

tom said...

Hi Greg,

I'd need to take a look at your Cerebral Cortex article before I can comment on that, but if you ensured that all conditions were attended to equally and yet still saw no left hemisphere preference, then I promise I'll rethink my position.

In re. word deafness, this is a killer point and invalidates any theory of speech perception that claims that the right hemisphere is unable to extract any meaning whatsoever from the speech signal. You could have also mentioned your excellent WADA study to make the same point. Yes, it's true that extensive left-hemisphere damage doesn't put you on the floor in terms of speech comprehension, but it certainly does take you off the ceiling (>25% error rate on single words after left hemisphere deactivation according to your WADA results - this is shockingly bad performance). Decades of neuropsychology tells us this, and also that we don't usually end up with speech comprehension difficulties after similar damage to the right hemisphere (<5% error rate after WADA to right hem). Outside the lab, left hemisphere aphasic patients and their carers generally report problems with comprehending spoken discourse - this isn't the case for right hem. patients. The left temporal lobe is doing something perfectly that the right temporal lobe is pretty bad at, relatively speaking. Don't you think it's important to find out what that something is?

I'm convinced the right hemisphere is also processing useful aspects of speech-related information but, under lab conditions, the information that it processes doesn't seem to be critical for understanding intelligible speech in the form of spoken words and simple sentences.

Greg Hickok said...

You make some great points, Tom. Indeed, these are the issues we need to be focusing on: both hemispheres are contributing to speech perception/recognition, so what exactly are their respective roles?

Unfortunately, some folks are still making sweeping statements about the left hemisphere organization of intelligible speech and/or speech perception. For example, here's a quote from Rauschecker & Scott (2009) Nature Neuroscience: "Speech perception and production are left lateralized in the human brain…" (p. 720). These kinds of statements perpetuate the myth that the left hemisphere is doing everything, all by itself, and this is bad for the field, I think.

To qualify your comments on the Wada study, patients with a deactivated left hemisphere make phonological errors at a rate less than 10%. Off the ceiling to be sure, but still quite good for having NONE of "the pathway for intelligible speech".

So you say that, "The left temporal lobe is doing something perfectly that the right temporal lobe is pretty bad at, relatively speaking. Don't you think it's important to find out what that something is?" Yes, and in fact David and I have been trying to work this out and have proposed that various levels of analysis (phon, lexical-sem, syntactic) have different degrees of lateralization. I would love to be debating these finer points, but it has been very difficult getting folks to stop thinking that the only thing the right hemisphere can do is prosody.