Monday, August 14, 2017

Pathways for intelligible speech - setting the record straight

A couple of days ago I tweeted the following regarding Sophie Scott's influential study on pathways for intelligible speech:


I followed this up by saying that the finding--left anterior STS activation for intelligible compared to unintelligible speech--had not been replicated in subsequent studies, one of my own in particular, which had a larger sample size.

A number of people, including the lead author, felt it was an unfair attack and that the finding has held up to replication.  Let me clarify a few things.

My motivation: The motivation for this post actually came from something I saw on Twitter.  It was a gif of a person suddenly looking completely deflated; the caption read something like 'When you realize the study at the foundation of your entire theory has N=9'.  This made me think of the Scott study that has influenced a lot of models (e.g., Rauschecker and Scott 2009 write, "Speech perception and production are left-lateralized in the human brain") yet is quite underpowered by today's standards.

Replication: I claimed that the study hadn't replicated.  In particular, a study from my lab with Kai Okada as lead was a direct replication of the Scott et al. study.  Like Scott et al. we found left aSTS activation for intelligible vs. unintelligible speech, but also found that the left aSTS was only the tip of the neural activation iceberg.  Not only did the contrast activate regions extending along the length of the STS, anterior to posterior, and bilaterally, but we also found, using pattern classification methods, that even Heschl's gyrus activity (which didn't show up in the intel vs. unintel contrast) could discriminate the two conditions.

So when I said the finding failed to replicate, what I meant was that subsequent findings failed to reproduce the pattern that the left aSTS was the only region selective for intelligible speech.  This is important, theoretically, because it speaks to differences between models, e.g., Hickok & Poeppel who argue that pSTS is part of the ventral stream versus Rauschecker & Scott who argue that the ventral stream flows only in the anterior direction from A1 (see their Figure 5).

My take on the role of Scott et al. 2000 in influencing the debate: I targeted the Scott et al. 2000 study because I believed that its emphasis on the left anterior STS in speech recognition is overly influential on current theory. My thinking was that if that original study had found what Okada et al. reported, we would be in a theoretically more balanced place.

Criticism of my take: Some respondents took me to task on my critique. Here's an exchange with Jonathan Peelle:


So maybe I'm being too harsh.  It's true that my belief regarding the influence of the Scott et al. 2000 study is based on my personal impression (supplemented by its citation frequency), which is no doubt biased because the left aSTS exclusivity is at odds with my own theory.  Maybe all those citations, or at least the recent ones, acknowledge that that study is not the whole picture.  Maybe they cite it just to provide evidence for *a* role of the left aSTS in speech recognition, which I agree with too, rather than *the* role.  Maybe it is cited only in connection with sentence level processing.

Evaluation of my assumptions: I decided to take a look at how the Scott et al. 2000 study is cited. This is not a systematic examination.  Basically I looked at only those studies published in 2016.  Here's what I found.

A number of studies correctly and appropriately cite Scott et al., either methodologically or as one finding highlighting part of a bigger network (Peelle's papers are a good example). Several, however, still cite the paper as evidence for a left anterior core for speech recognition, sometimes as the only paper cited:

"Further, there is compelling evidence that sensory areas feed into a pathway running from posterior in the temporal lobe to anterior aspects (Scott et al., 2000)" - Santi et al.
"The finding of left hemisphere dominance in the tract associations with the more linguistic domains is not unexpected and is highly consistent with previous findings (Rosen et al., 2011; Scott et al., 2000)." - Bajada et al.
"It provides a mechanistic explanation for the preponderant role of the anterior temporal lobe in lexical semantics as delineated by studies examining speech comprehension." - Ries et al.
"Together, these results are consistent with studies indicating that phonetic recognition occurs in the left anterolateral superior temporal cortex (i.e. the ventral auditory stream) (Binder et al., 2000; Scott et al., 2000; Leaver and Rauschecker, 2010; DeWitt and Rauschecker, 2012)." - Alho et al.

Here's one citation from a dissertation.  This is clearly an outlier and perhaps it should be disregarded as it hasn't gone through peer review, but another perspective on it is that it reflects the weight of that original finding on the field.
"This author knows of only one study that has examined nonintelligible speech-like sounds; and interestingly, no significant pSTS activation was found (Scott et al., 2000)."

Conclusion: My take is that many investigators are citing Scott et al. appropriately, as Jonathan Peelle suggested.  I also see some evidence, however, that the study biases some researchers toward the view that the left anterior STS/STG is the critical region for speech recognition/lexical processing.  Put differently, I still believe that if the original study had found bilateral activity in both anterior and posterior regions, we would as a field be in a somewhat different place, with fewer groups emphasizing the exclusivity of the anterior pathway (e.g., the Alho et al. quote).  In this sense, I see evidence in support of the assumption that motivated the tweets in the first place.

Was I too harsh?  I think yes.  Even though I still believe the original study is over-emphasized theoretically and inaccurately colors interpretation of the neural basis of speech processing in some circles, the content of my initial tweets made it sound like the work is completely useless.  That was not my intention or my belief and for that, Sophie and all, please accept my apology.

Lessons for all of us: In a 140-character statement it is way too easy to come across in a way that wasn't intended. I will certainly re-read my draft tweets and consider how they might be read.  I'm sure this won't prevent alternative readings but it could help reduce them. At the same time, if I read something that rubs me the wrong way, I will also give other tweeters a little slack and ask for clarification.

1 comment:

Sam said...

Readers may be interested in this: http://journal.frontiersin.org/article/10.3389/fnhum.2017.00041/full which provides an overview of the 4 studies associated with the Scott et al. (2000) study (the original + 3 fMRI replications). This short review paper outlines what has been learnt from them about the "what" pathway and about fMRI replication more generally. It asks "how similar do two statistical brain maps have to be to constitute a successful replication?" - which seems to be pertinent in the context of the above discussion.