Most experiments on "speech perception" ask participants to discriminate pairs of syllables or to identify which sound they heard. The recent paper by D'Ausilio et al. used such a measure, and in their response to my commentary some questions were raised about my "task-specific effect" comment (I decided not to address them for lack of energy, but I'll tell you why I don't buy their arguments if anyone wants to know). A recent thoughtful comment here on Talking Brains by Marc Sato, however, included a quote that sparked enough energy to motivate a few words on my part. The quote was from a recent paper (this is not a dig at anything Marc said; it's just that this quote got me thinking):
"speech perception is best conceptualized as an interactive neural process involving reciprocal connections between sensory and motor areas whose connection strengths vary as a function of the perceptual task and the external environment."
I don't know what other folks are studying when they study speech perception, but to me speech perception is best conceptualized as the process that allows a listener to access a lexical concept (~word meaning) from a speech signal. This is what "speech perception" does in the real world. It is one step in the conversion of variations in air pressure into meaning. I'm pretty sure the capacity for "speech perception" didn't evolve or develop to allow us to tell an experimenter whether we heard a /ba/ or a /pa/. In fact, the next time you have the pleasure of talking to a speech scientist who regularly employs such methods, pause after a sentence you speak and ask whether, in that last sentence, you uttered the syllable /ba/ or not. S/he will have no idea: we don't perceive phonemes, we perceive word meanings. For the most part, the ability to make conscious decisions about phonemes is a useless ability in the context of auditory speech processing, and one that is probably only available to literate individuals, by the way (I can dig up some refs if anyone is interested). If you are interested in studying the ability to make judgments about speech sounds, that is perfectly fine; after all, it appears to be highly relevant to reading -- an important issue. But don't assume that you are studying anything that is necessarily relevant to what happens in the real world of auditory speech processing.
Let me really stick my neck out and say this: if you are going to use a task that requires listeners to make judgments about speech sounds (syllable discrimination or identification), then in order to claim that you are studying anything relevant to how speech is actually processed in the real world, you had better have some empirical data to back it up; i.e., the effect had better hold for comprehension and not just for metalinguistic judgments.