Do babies perceive speech with their tongues?

Recently Alison Bruderer (a post-doctoral fellow in Janet Werker’s lab) reported to @ASHAweb that impeding infants’ tongue movements impairs their ability to perceive speech, concluding further that her results, published in PNAS, support the motor theory of speech perception. Tara McAllister Byun (@byunlab) has commented on social media quite adequately in her own right but has asked me to follow up with a blogpost since we share the same concerns, so here I go! This is the long version of Tara’s complaint.

Some background about infant speech perception testing is required. Infant speech perception research exploded after Peter Eimas and colleagues perfected a technique called the high amplitude sucking procedure in the early 1970s. With this procedure, the infant sucks on a special soother while specific speech input is presented contingent upon a criterion rate of sucking. There is an initial increase in the infant’s sucking rate followed by a decline as the infant habituates to the stimulus. After habituation, a new stimulus is introduced, and an increase in the infant’s sucking rate is interpreted as evidence that the infant discriminates the new stimulus from the initial input. As the picture shows (borrowed from Scientific American, scandalously without permission!), the infant must be able to perceive speech with the tongue fully occupied in order to succeed at this experiment. And indeed they did: Eimas demonstrated that newborn infants can perceive many contrasts. Everyone was so excited by this that textbooks to this day claim that all infants can perceive all the phonetic contrasts of all the world’s languages at birth. This is not at all true, even though infant speech perception abilities are indeed remarkable.

Since that time many new test techniques have been developed, some employing this idea of habituation in one form or another and others using reinforcement of specific responses to speech input. What do we know from the thousands of studies published since? Infants can indeed perceive many contrasts but not all of them. Some contrasts are much easier for infants to perceive than others: for example, fricative contrasts remain difficult to perceive throughout infancy and corner vowels are more salient than central vowels. The specific acoustic characteristics of the stimuli relative to the infant’s experience are very important and affect the infants’ performance in surprising ways. Infant performance is highly vulnerable to memory and attention constraints, and many infants, sometimes the majority tested, do not complete the test procedures. Test-retest reliability for individual babies is close to zero. The procedures are so finicky that when Nittrouer pointed out that many infants cannot perceive fricative contrasts, she was criticised for testing the infants in a high chair rather than in their mothers’ laps, as is customary in other labs! Infant perception is shaped by environmental speech input such that perceptual sensitivity for native language contrasts is enhanced while perceptual sensitivity for foreign language contrasts declines during the first year of life. There are discontinuities in perceptual performance during development as perceptual knowledge is integrated with other linguistic skills. For example, the ability to perceive phonetic contrasts seems to disappear and then reappear around 12 months of age as the infant reorganizes perceptual knowledge to serve the process of word recognition.

The procedure used by Bruderer involves presenting the infant with a series of alternating and nonalternating trials in which some stimuli are different versions of the same phonetic category and other stimuli come from different phonetic categories, like this:

[Figure: sequence of alternating and nonalternating trials]

There are actually 8 trials but the point is that nonalternating (Nalt, 2 stimuli from the same category such as dental) and alternating (Alt, 2 stimuli from different categories such as dental vs retroflex) trials are interleaved across the experiment so that the infant’s behavior can be compared for these two trial types, in pairs (i.e., trial 1 vs trial 2 is pair 1 and so on). The behavior that is recorded is “looking time”. It is assumed that if the infant looks at a checkerboard pattern longer when hearing Alt trials, in comparison to Nalt trials, then the infant can hear the difference between the stimuli during the Alt trials. You also expect that looking times will generally decline over the experiment. So you hope to see results as shown in the figures below. I had to make these up because of paywalls on the relevant articles but the left figure is similar to what Bruderer found for 6-month-old infants in her experiment (they can perceive the foreign language contrast as revealed by divergence in the lines) and the right figure is similar to what Yeung and Werker found for 9-month-old infants (they can no longer perceive the contrast as revealed by overlapping lines).

Hypothetical but realistic 6 mo vs 9 mo looking times (no teethers)

So now we can ask what you would expect if putting a teether in the infant’s mouth interfered with perception because… motor theory of speech perception! I would expect looking times on Alt trials to be low and overlapping with looking times for Nalt trials, as in the figure on the right. What actually happened? When a flat teether was put in the infants’ mouths, looking times for both Alt and Nalt trials started very high and then dropped steadily over the remaining three pairs of trials, overlapping to give the impression that the infants could not discriminate the stimulus pairs in the Alt trials. Then they ran a different group of infants with a “gummy” teether in their mouths; in this case tongue movement was not inhibited. Here, looking time was high during pairs 1, 2, and 3 and dropped during pair 4 but only in the Nalt condition; therefore the lines diverged in pair 4. In other words it looks as if teethers cause looking times for Nalt trials to go up, rather than looking times during Alt trials to go down! I can’t actually prove that because you cannot really compare across groups of babies in this study. But the researchers’ conclusions are based on a comparison of p values across the 3 groups of babies. When looking times across trial types are compared within each group of babies, the following statistical results were reported: Group 1 (no teether) F(1,23) = 4.32, P = 0.049; Group 2 (flat teether) F(1,23) = 0.011, P = 0.92; and Group 3 (gummy teether) F(1,23) = 5.26, P = 0.031. Comparing Group 2 with Group 3, they conclude that inhibiting tongue movements impairs speech perception.

What are the reasons to be cautious about the interpretation of these results? The first issue is that the conclusions are drawn by essentially comparing the p value obtained for Group 2 with the p value obtained for Group 3 and concluding that the two groups are different, which is not a safe inference (see Gelman and Stern on this point). There is no direct evidence that the behavior of the infants in these two groups is substantially and significantly different. The second issue is that it is difficult to be sure what the infants are perceiving or not perceiving as they participate in this task. Recall that perception is intimately related to attentional factors: in fact our brains do not lose the ability to perceive foreign language phonetic contrasts, we learn to ignore them, as is elegantly shown by Cheour’s ERP studies. The introduction of the teethers appears to change the habituation behavior of the infants in the study and thus it is not clear how to interpret their looking time. Possibly the teethers shift attention to somatosensory feedback, so that looking time indexes those perceptual inputs and tells us nothing at all about speech perception abilities. Alternatively, the teethers may enhance overall arousal in such a way that the infants are actually as sensitive to the within category differences (Nalt trials) as they are to between category differences (Alt trials) in speech sounds. Thirdly, it is possible that at 6 months of age there is a temporary disruption in speech perception in the flat teether condition that reflects an emerging link between brain areas for perception and speech production, a link that Kuhl says is activated by speech practice in babbling. In this case I am taking the reported results at face value, but I am not accepting that they are evidence for the motor theory of speech perception, which posits that the objects of speech perception are innately the articulatory gestures themselves.
In any case, many more studies with varying design and analysis strategies are required before we can be sure of the interpretation of these intriguing findings. Once again Janet Werker’s lab is at the forefront of an exciting new area of infant research.
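
One way to see the Gelman and Stern point in numbers is a rough Python sketch that uses only the F values reported above for Group 2 and Group 3. It rests on assumptions that cannot be checked from the published summary statistics: that each F(1,23) is the square of a paired t statistic for the Alt versus Nalt difference, that the two groups have the same variance in those difference scores, that both effects point in the same direction, and that a normal approximation is adequate. The function names are hypothetical, invented for this illustration, and this is not the analysis used in the paper.

```python
import math


def interaction_z(f_a, f_b, sign_a=1, sign_b=1):
    """Approximate z for the between-group difference in the Alt-vs-Nalt effect.

    ASSUMPTIONS (not reported in the paper): equal group sizes and equal
    variance of the difference scores, so that each group's t = sqrt(F) is
    its mean Alt-minus-Nalt difference expressed in the same standard-error
    units. The signs are assumptions too, because F hides the direction.
    """
    t_a = sign_a * math.sqrt(f_a)
    t_b = sign_b * math.sqrt(f_b)
    # The difference of two independent estimates with equal standard errors
    # has a standard error that is sqrt(2) times larger.
    return (t_b - t_a) / math.sqrt(2)


def two_sided_p(z):
    """Two-sided p value for z under a standard normal approximation."""
    return math.erfc(abs(z) / math.sqrt(2))


# F values as reported for the flat teether and gummy teether groups.
F_FLAT = 0.011
F_GUMMY = 5.26

z = interaction_z(F_FLAT, F_GUMMY)
p = two_sided_p(z)
print(f"approximate z = {z:.2f}, two-sided p = {p:.3f}")
```

Under these assumptions the p value for the direct comparison comes out above 0.05, even though one group is “significant” and the other is not. Flipping the assumed sign of the tiny Group 2 effect (`sign_a=-1`) does not change that. Only an analysis of the raw looking times could settle the question properly.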


Aslin, R. N., Werker, J. F., & Morgan, J. L. (2002). Innate phonetic boundaries revisited. Journal of the Acoustical Society of America, 112, 1257.

Bruderer, A. G., Danielson, D. K., Kandhadai, P., & Werker, J. F. (2015). Sensorimotor influences on speech perception in infancy. Proceedings of the National Academy of Sciences, 112(44), 13531-13536. doi: 10.1073/pnas.1508631112

Cheour, M., Ceponiene, R., Lehtokoski, A., Luuk, A., Allik, J., Alho, K., & Näätänen, R. (1998). Development of language-specific phoneme representation in the infant brain. Nature Neuroscience, 1, 351-353.

Cristia, A., Seidl, A., Singh, L., & Houston, D. (2016). Test-retest reliability in infant speech perception tasks. Infancy, Early View, 1-20.


Eimas, P. D. (1985). The perception of speech in early infancy. Scientific American, 252, 46-52.

Gelman, A., & Stern, H. (2006). The difference between “significant” and “not significant” is not itself statistically significant. The American Statistician, 60(4), 328-331. doi: 10.1198/000313006X152649

Imada, T., Zhang, Y., Cheour, M., Taulu, S., Ahonen, A., & Kuhl, P. K. (2006). Infant speech perception activates Broca’s area: a developmental magnetoencephalography study. Neuroreport, 17, 957-962.

Kuhl, P. K. (2004). Early language acquisition: Cracking the speech code. Nature Reviews: Neuroscience, 5, 831-843.

Nittrouer, S. (2001). Challenging the notion of innate phonetic boundaries. Journal of the Acoustical Society of America, 110(3), 1598-1605.

Mattock, K., Polka, L., Rvachew, S., & Krehm, M. (2010). The first steps in word learning are easier when the shoes fit: comparing monolingual and bilingual infants. Developmental Science, 13, 229-243.

Yeung, H. H., & Werker, J. (2009). Learning words’ sounds before learning how words sound: 9-month-olds use distinct objects as cues to categorize speech information. Cognition, 113, 234-243.

  1. Heidi Diepstra

     /  February 24, 2016

    Appreciate your critical review of this study. In defense of the authors however, nowhere in the original PNAS publication do I read that the authors claim that the results are evidence for the motor theory of speech perception or that “objects of speech perception are innately the articulatory gestures themselves”. In fact, they note “the current research cannot distinguish whether there is a single, innate representation for the perception and production of speech as in the strong version of the motor theory of speech perception or whether perception and production are separate but linked, with each guiding and informing the other throughout the period of language acquisition.” (p. 13534) Interestingly, ASHA Leader, as well as Scientific American, immediately reported on the results as “New data supporting the Motor Theory of Speech Perception ..”. In the original article the authors seem to be cautious about drawing such a strong conclusion. I’m now wondering if the journalists’ interpretation of the results indeed matches the authors’ original interpretation?

    • Thank you for your comment Heidi, it is indeed interesting to notice the difference in the interpretation offered in the published paper in comparison to the comments made in the ASHA Leader interview. There are two possible and not mutually exclusive explanations. One is that there are two authors, one of whom may have a more cautious interpretation of the findings than the other. Second, peer reviewers usually insist that authors temper their conclusions, frequently with more caution than the authors might like. As I said, speaking for myself in the blog, my gut feeling is that there is no actual difference in infant responding across Groups B and C but I cannot prove that given the data provided. All of us have to separate our “gut feelings” from the scientific facts but I think that it is actually helpful to acknowledge “gut feelings”. If we pretended as scientists that we are too “special” and “objective” to have them we would not be able to differentiate subjectivity from objectivity. I also think that the kind of enthusiasm and partisanship that leads us as scientists to sometimes over-interpret our own work is a good thing. Otherwise we would never be crazy enough to do science, which is a terribly difficult thing to do for years and years with no guarantee of a good outcome. However the peer review process has its uses! Thank you again for reading my blog and taking the time to comment.
