Speech Therapy and Speech Motor Control: Part 3

In two previous blogs I discussed a recent paper by Strand in which she outlines in detail the theoretical foundation and procedural details of Dynamic Temporal and Tactile Cueing (DTTC) as a treatment for Childhood Apraxia of Speech (CAS). In Part 1 I suggested that the theoretical base, being Schmidt’s “Schema Theory of Discrete Motor Skill Learning,” was outdated. In Part 2 I discussed modern theories of speech motor control that assume a dynamic interplay of feedforward and feedback control mechanisms. In this blog I will discuss the implications for speech therapy, in relation to critical aspects of DTTC.

First, let us consider the core element of DTTC, “the focus on the movement (rather than the sound or phoneme) in terms of modeling, cueing, feedback, and target selection” (p. 4). I believe that all of us who strive to help children with CAS acquire intelligible speech agree that speech movements are the focus of speech therapy, as opposed to phonological contrasts. Nonetheless, this statement raises questions about the nature of “speech movements.” What is the goal of a speech movement? The answer to this question is controversial: it may be a somatosensory target involving specific articulators, for example, bringing the margins of the tongue blade into contact with the upper first molars; or it may be to produce a particular vocal tract shape such as a large back cavity separated from a small front cavity by a narrow constriction; or it may be to produce an acoustic output that will be perceived as the vowel [i]. DTTC is structured to promote precise and consistent movements of the articulators and therefore the first scenario is presumed. Furthermore, the origin of CAS is hypothesized to be a deficit in proprioceptive processing that arises from an impairment in cerebellar mechanisms. Updating the theory, this hypothesis would implicate feedforward control which, following from Guenther and Vladusich (2012), “projects directly from the speech sound map [in left ventral premotor cortex and posterior Broca’s area] to articulatory control units in cerebellum and primary motor cortex” (p. 2). However, new research (Liégeois et al., 2019) identifies the locus of structural and functional impairments underlying CAS as being along a dorsal pathway of cortical structures, specifically: reduced white matter and fMRI activations in sensory motor cortex and along the arcuate fasciculus and reduced grey matter and fMRI activations in superior temporal gyrus and angular gyrus. 
They explain that “this route links auditory input/representation to articulatory systems … and transforms phonological representations into motor programs … In contrast, the speech execution white matter pathway (corticobulbar) and the ventral language route (IFOF) were not altered in this family” [that showed multigenerational impairments in speech praxis]. My point is that although the cerebellum is important to speech motor control and CAS may well involve impairments in proprioceptive feedback, speech is clearly a sensory motor skill that requires close connection among articulatory and auditory representations for sounds and syllables.

In Part 2 of this blog series I indicated that adults can compensate for unexpected perturbations to articulatory trajectories or auditory feedback very rapidly by drawing on their internal model of vocal tract function. It is interesting to consider that throughout speech development children cope with perturbations to articulatory gestures and expected acoustic outputs because their vocal tract is changing shape, sometimes quite dramatically, throughout childhood. Callan et al. (2000) showed how the developing child can adapt to the changing vocal tract by aiming for relatively stable auditory targets (conceived of as regions in auditory space) and using auditory feedback and simulations of auditory outputs to achieve those targets even as vocal tract structure is changing. The key to this remarkable ability is a learned mapping between articulator movements, vocal tract shapes and auditory outputs. The learning and updating of this internal model of vocal tract function arises from an unsupervised learning mechanism, essentially Hebbian learning: young infants engage in a great deal of unstructured vocal play as well as somewhat more structured babbling – speech practice that allows them to learn the necessary correspondences without having specific speech goals. Infants with CAS are widely believed to skip this period of speech development; therefore, it is likely they begin speech therapy without an internal model of vocal tract function which is foundational for goal directed speech practice. Therefore, precise, repeated, consistent speech movements may not be the best place to start a treatment program for severe CAS; a program of unstructured vocal play that targets highly varied playful vocalizations is a better starting place for many children. 
Subsequently, high intensity practice with babble (repetitive syllable production) will stabilize the mappings between articulatory gestures and the resulting vocal tract configurations and somatosensory and auditory outcomes.
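To make the idea of an internal model concrete, here is a deliberately tiny toy sketch. It is not the Callan et al. model, not DIVA, and not Hebbian learning: it assumes a one-dimensional "vocal tract" whose auditory output is simply a gain multiplied by the articulatory command, and it uses an ordinary error-driven (least-mean-squares) update. Every name and number in it is a hypothetical chosen for illustration. It shows only the logic of the argument: undirected babbling tunes the mapping, growth of the vocal tract produces auditory error, and more vocal play retunes the mapping.

```python
import random

def babble(weight, gain, n_trials, rng, rate=0.1):
    """Refine the internal estimate of the vocal tract through undirected play."""
    for _ in range(n_trials):
        command = rng.uniform(-1.0, 1.0)   # random articulatory command
        heard = gain * command             # what the (toy) vocal tract produces
        predicted = weight * command       # what the internal model expects
        weight += rate * (heard - predicted) * command  # error-driven update
    return weight

def attempt(target, weight, gain):
    """Pick a command for an auditory target; return the auditory error."""
    command = target / weight
    return gain * command - target

rng = random.Random(1)
weight = babble(weight=0.2, gain=1.0, n_trials=2000, rng=rng)
error_before_growth = attempt(0.5, weight, gain=1.0)
error_after_growth = attempt(0.5, weight, gain=1.5)    # vocal tract has changed
weight = babble(weight, gain=1.5, n_trials=2000, rng=rng)
error_after_more_play = attempt(0.5, weight, gain=1.5)
print(error_before_growth, error_after_growth, error_after_more_play)
```

In this toy, the auditory target never changes; only the mapping from command to sound does, which is why continued vocal play is enough to recover accuracy.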

One of the advantages of a well-tuned internal model of vocal tract function is that it supports “motor-equivalent speech production” given commonly occurring constraints on speech production. In other words, there are many different articulatory gestures that will produce the same acoustic-phonetic goal. When the child has a stable acoustic-phonetic target and is able to process auditory feedback in relation to that target, various articulatory solutions can be found to adapt to changing vocal tract structure or constraints such as talking while eating or holding a pen between the teeth. Developmental changes in the way that articulators are coordinated to produce the same phoneme are well documented in the literature. Similarly, speech production varies with phonetic context. Motor equivalent trading relations between tongue body height and lip rounding are well known for production of the vowel [u] and the consonant [ʃ], for example, and the front-back positioning of the constriction in these phonemes is highly variable across speakers and phonetic contexts. The precision with which these phonemes are produced is related to the talker’s perceptual acuity: for example, adults who have sharp perceptual boundaries between [ʃ] and [s] produce them with greater articulatory consistency as well as greater acoustic contrast between the phoneme categories. Perkell et al. (2004) speculated “In learning to maximize intelligibility, the child with higher acuity is better able to reject poor exemplars of each phoneme (as in the DIVA model), and thus will adopt sensory goals for producing those phonemes that are further apart than the child with lower acuity.” The implications for speech therapy are that, even in the case of CAS, ensuring stable acoustic-phonetic targets for speech therapy goals is essential whereas insisting upon SLP defined articulatory parameters may be counter-productive. 
The goal is not absolute consistency in the production of specific motor movements, but rather, dynamic stability in the achievement of speaking goals.

Although it is speculated that feedforward control is weighted more heavily than feedback control in adult speech, feedback is critical to speech learning during infancy and childhood. Furthermore, auditory feedback plays a crucial role. The initial goal is an auditory target. Guenther and Vladusich (2012) explain that “the auditory feedback control subsystem [helps to] shape the ongoing attempt to produce the sound by transforming auditory errors into corrective motor commands via the feedback control map in right ventral premotor cortex” (p. 2). They further explain that repeated practice of this type eventually leads to the development of somatosensory goal regions. A particular frustration for children with CAS is perseveration, the difficulty of changing a well-learned articulatory pattern to a new one that is more appropriate. This problem with perseveration highlights the need to engage the feedback control system. There are two strategies that are essential. The first is a high degree of variation in the practice materials, which can be introduced by practicing nonsense syllables with a carefully graded increase in difficulty but variation in the combination of syllables within difficulty levels. The second strategy is to provide just the right amount of scaffolding along the integral stimulation hierarchy so that the child will be successful more often than not while experiencing a certain amount of error. Some error ensures that corrective motor commands will be generated from time to time. Imagine practicing syllables that combine four consonants [b, m, w, f] with four vowels [i], [u], [æ], [ɑ] and five diphthongs [ei], [ou], [ɑi], [au], [oi], presented at random so that the child imitates the first syllable (Say [bi]) and then repeats it again twice (Say it again… and again…), before proceeding to another syllable. You will have a great many targets in your session but created from a small number of elements. 
Imagine further that you progress to a more difficult level (reduplicated syllables, [bubu], [mimi]) as soon as the child achieves 80% correct production of the single syllables. You can see that you will also be allowing the child to produce quite a bit of error. We call this the challenge point. Tanya Matthews, Francoise Brosseau-Lapré and I are working on a paper to describe how to do this and describe our experiences with the approach. You will see that it is very different from working on five words and requiring that the child achieve 15 to 20 correct productions at the imitative word level before proceeding to delayed imitation and then again before proceeding to spontaneous productions. Errorless learning is a fundamental aspect of DTTC and has a long history in speech therapy practice. However, it is not clear that it is well-motivated from the perspective of developmental science.
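For readers who like to see the bookkeeping, the practice structure just described can be sketched in a few lines. This is only an illustration under my own assumptions, not the protocol from the paper in preparation: the function names are hypothetical, the consonants and syllable nuclei are the ones listed above, and the step-up rule is simply "move up once the proportion correct reaches 80%".

```python
import random

CONSONANTS = ["b", "m", "w", "f"]
NUCLEI = ["i", "u", "æ", "ɑ", "ei", "ou", "ɑi", "au", "oi"]

def build_block(rng, reduplicated=False):
    """Return every consonant + nucleus syllable once, in random order."""
    syllables = [c + v for c in CONSONANTS for v in NUCLEI]
    if reduplicated:
        syllables = [s + s for s in syllables]   # e.g. "bu" becomes "bubu"
    rng.shuffle(syllables)
    return syllables

def next_level(level, results, criterion=0.8):
    """Move up one level once the proportion correct reaches the criterion."""
    accuracy = sum(results) / len(results)
    return level + 1 if accuracy >= criterion else level

rng = random.Random(42)
block = build_block(rng)
# Each target is imitated once and then repeated twice more.
trials = [syllable for syllable in block for _ in range(3)]
print(len(block), "targets,", len(trials), "trials")
```

With these elements the block holds 36 different targets and 108 trials, which is the point: many targets from few elements. A call such as `next_level(1, [True, True, True, True, False])` returns 2, so the child moves on while still producing some error.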

To summarize, there are many aspects of DTTC that are similar across all sensory-motor approaches to the treatment of CAS. In particular high intensity speech practice is well motivated and likely to be effective with all forms of moderate and severe speech sound disorder. Nonetheless there are some significant differences between Strand’s approach and the approach that I recommend based on an updated theory of speech motor control. There is still a great deal of research to do because very few of our specific speech therapy practices have received empirical validation even though speech therapy in general has been shown to be efficacious. As a guide to future research (hopefully using randomized and thus interpretable designs), I provide a table of procedures that are similar and different across the two theoretical approaches.




Treatment Procedures that are Similar

High intensity practice
Focus on speech movements (not phonemes)
Practice syllable sized units (not isolated sounds)
Attend to temporal aspects of trial structure (delayed imitation, delayed provision of feedback)
Integral stimulation hierarchy (attend to visual and auditory aspects of target)

Treatment Procedures that are Different

DTTC (Strand) | Updated approach
Focus on precise, consistent movements | Focus on dynamic stability
Over-practice: accuracy over 10-20 trials | Variable practice when possible
Errorless learning | Challenge point: 4/5 correct, then move up
Behavioral shaping of accurate movements | Motor equivalent movements
Tactile and gestural cues to ensure accuracy | Sharpen knowledge of auditory target
“Hold” initial configurations | Encourage vocal play, develop internal model


Callan, D. E., Kent, R. D., Guenther, F. H., & Vorperian, H. K. (2000). An auditory-feedback-based neural network model of speech production that is robust to developmental changes in the size and shape of the articulatory system. Journal of Speech, Language, and Hearing Research, 43, 721-738.

Guenther, F. H., & Vladusich, T. (2012). A neural theory of speech acquisition and production. Journal of Neurolinguistics, 25(5), 408-422.

Liégeois, F. J., Turner, S. J., Mayes, A., Bonthrone, A. F., Boys, A., Smith, L., . . . Morgan, A. T. (2019). Dorsal language stream anomalies in an inherited speech disorder. Brain, 142(4), 966-977.

Perkell, J., Matthies, M., Lane, H., Guenther, F. H., Wilhelms-Tricarico, R., Wozniak, J., & Guiod, P. (1997). Speech motor control: Acoustic goals, saturation effects, auditory feedback and internal models. Speech Communication, 22, 227-250.

Perkell, J., Matthies, M. L., Tiede, M., Lane, H., Zandipour, M., Marrone, M., . . . Guenther, F. H. (2004). The distinctness of speakers’ /s/-/ʃ/ contrast is related to their auditory discrimination and use of an articulatory saturation effect. Journal of Speech, Language, and Hearing Research, 47, 1259-1269.

Rvachew, S., & Matthews, T. (2017). Demonstrating treatment efficacy using the single subject randomization design: A tutorial and demonstration. Journal of Communication Disorders, 67, 1-13.

Rvachew, S., & Matthews, T. (2019). An N-of-1 Randomized Controlled Trial of Interventions for Children With Inconsistent Speech Sound Errors. Journal of Speech, Language, and Hearing Research, 62, 3183–3203.

Speech Therapy and Theories of Speech Motor Control: Part I

Edy Strand recently published a detailed description of her Dynamic Temporal and Tactile Cueing treatment strategy. As she says, this is a hugely valuable paper because it provides a complete description of a treatment designed for severe speech sound disorders, especially Childhood Apraxia of Speech, and more importantly, it summarizes in one place the theoretical foundation for the treatment. I think that, on the whole, this is an efficacious treatment. However, some procedures, derived directly from the outdated theoretical underpinnings, are questionable, and therefore I am going to devote several blogs to more recent theory and basic science research on the development of speech motor control and apraxia of speech. In this first blog, I review Schema Theory, even though this theory is just not right! But it has a long history and remains popular across almost all clinically-oriented papers on motor speech disorders.

The theory that is referenced in Edy Strand’s paper is Richard Schmidt’s “Schema Theory of Discrete Motor Skill Learning,” published in Psychological Review in 1975 and subsequently brought to speech-language pathology by Ray Kent and others as a useful framework for thinking about speech therapy. The important idea underlying this theory is that motor skills are made up of brief, discrete motor acts that are executed all-at-once as open-loop generalized motor programs, adapted with specific response specifications (called parameters) for the current conditions. The theory assumes “open-loop” control because sensory feedback is often too slow to impact movement after it has started. According to this theory feedback is processed after the movement is over and incorporated into the schema for the future execution of the generalized motor program. I have used golf as an example before; even though I haven’t played much in years let’s do it again: if we are adopting this theory we would think of practice sessions as developing different generalized motor programs for each type of shot, a long drive, a short 7-iron shot, the up-and-down pitch onto the green, and the putt into the hole. Which shot you choose depends upon your recall schema: what is your target and which type of shot is likely to achieve it? I personally recall that when close to the green my pitch is better than my chip (whereas my husband has the opposite preference). How you address the ball depends upon the initial conditions (flat ground, hill, tall grass etc.). The motor control parameters (also known as response specifications) depend upon the distance to the target (how high to lift the club, speed of follow through, force applied and so on). 
Based on the initial conditions and the desired outcome, I launch the shot with my wedge, expecting a certain “feel” as I hit the ball based on past experience with the sensory consequences of hitting this shot; I can always “recognize” a good hit even before I see the ball land (often I just turn my back on the ball, I don’t even want to see it land!). But in any case, the actual outcome is important for updating the “recall” schema; specifically, if I have actually achieved my target, I add all this information, the initial conditions, the response specifications, the recognition schema and the recall schema to my memory. The generalized motor program is an abstraction across all these remembered practice trials, permitting correct specification of the response parameters in future shots. Furthermore, I should be able to adapt the generalized motor program to similar shots, even if the ball is a little further or closer to the green for example.

When applied to CAS, in which current research suggests unreliable or degraded somatosensory feedback, the use of this model focuses attention on the child’s processing of initial conditions, inaccurate planning or programming of the movement due to poor selection of response specifications, and/or poor recognition schema (not knowing when the movement “feels right”). Therefore, certain procedures are recommended. DTTC providers use manual or gestural cues to shape the child’s articulators into the “initial position” and encourage the child to “hold” the position momentarily so as to fully process those initial conditions before launching the movement. During the initial stages of therapy, the SLP uses a slow rate and co-production so that the child is getting extra feedback during the practice trial, presumably with the goal of stabilizing the recognition schema. Imitative models support the child’s knowledge of the target which, when combined with copious knowledge of results feedback, should support the development of recall schema. And finally, a great deal of practice with an errorless approach ensures that the child lays down many memory traces of correctly executed motor programs.

The recommendations that are provided make a certain amount of sense given the context of schema theory (even though there is in fact no evidence for the specific efficacy of any one of these particular procedures). The problem is that it is not clear that schema theory is a reasonable foundation for modern speech therapy practice.

First, Richard Schmidt himself cautioned in 2003 that “schema theory was intended to be an account of discrete actions. Hence, continuous actions, such as steering a car or juggling, which are both of longer duration (allowing time for response-produced feedback to have a role) and more based on the performer’s interactions with the environment were outside the area for schema theory…long-duration actions might be based on interplay between open-loop subactions and feedback-based corrections… . Interestingly, tasks such as juggling seem appropriate for analysis in terms of the dynamical systems perspective” (p. 367). I would argue that our understanding not only of juggling but also of speech motor control has benefited immensely from the dynamical systems perspective and I will come back to that in the next blog. If juggling is considered too complex and continuous to be explained by schema theory, probably speech is not a good fit either.

Second, modern theories of speech motor control have shown that on-line correction of motor action even over short durations occurs despite the limitations of feedback control. The explanation lies in the continuous operation of feedforward control mechanisms. More on feedforward control in another blog.


Rvachew, S., & Brosseau-Lapré, F. (2012). Developmental Phonological Disorders: Foundations of Clinical Practice. San Diego, CA: Plural Publishing.

Schmidt, R. A. (1975). A schema theory of discrete motor skill learning. Psychological Review, 82(4), 225-260. doi:10.1037/h0076770

Schmidt, R. A. (2003). Motor schema theory after 27 years: Reflections and implications for a new theory. Research Quarterly for Exercise and Sport, 74(4), 366-375.

Strand, E. A. (2019, Early View). Dynamic Temporal and Tactile Cueing: A Treatment Strategy for Childhood Apraxia of Speech. American Journal of Speech-Language Pathology. doi:10.1044/2019_AJSLP-19-0005

Using Orthographic Representations in Speech and Language Therapy

Word learning, and in particular, productive word learning is associated with three important processes in the phonological domain: first, the child must encode the acoustic-phonetic form of the word in the language input; second, the child must transform this representation into a lexical representation, generally considered to take on a more abstract phonological form; finally, the child must retrieve the representation to reproduce it. The first process is reliant on speech processing abilities that have been shown to be impaired in many children with speech, language and reading deficits, as shown, for example, by Ben Munson and colleagues (@benjyraymunson) and Nina Kraus and colleagues. Phonological encoding is enhanced by access to repeated high-quality but variable inputs as shown by Richtmeier et al. for normally developing children and by Rice et al. for children with SLI. The majority of children with SSD have difficulties with encoding: we have a paper in press with the American Journal of Speech-Language Pathology showing that speech accuracy in these children can be improved with an approach that focuses largely on the provision of intense high quality input – I will have more to say on this subject when it (finally) emerges in print.

The second process, forming a phonological representation and storing it in the lexicon, involves articulatory recoding which can be a serious problem for children with severe SSD, accounting for deficits in speech accuracy (especially in association with inconsistency), nonword repetition, word learning, productive vocabulary, word finding, rapid automatic naming, and other phonological processing skills. These children are often diagnosed with motor planning disorders but I have pointed out previously that the problem is actually at the level of phonological planning. I have further pointed out the very close relationship between speech planning and memory. Children who are having difficulty with phonological planning may not show the same benefit from a therapy approach that is focused on the provision of high quality inputs. Therefore, a new paper on the use of orthographic inputs to teach new words caught my eye. Ricketts et al. taught children with SLI and ASD as well as younger and age-matched children with typical language to label nonsense objects with new names, using a computer program. For some words, the children were exposed only to the object–auditory word pairing; for others they saw the object, heard the word and saw a printed version (orthographic representation) as well. All children found it easier to learn the new words when they were exposed to the orthographic representation along with the auditory word.

This study reminded me of the research we are doing with children who are referred to our clinic with an apraxia diagnosis due to inconsistent speech errors. So far, 40% of those children have difficulty with phonological planning rather than motor planning as revealed by the syllable repetition test, as I have explained in a previous blog. We have been using a single subject randomization design to compare the relative efficacy of two treatment approaches with these children. The Phonological Memory & Planning (PMP) intervention pairs the phonemes in the target words with visual referents that include letters as shown here. Imitative models are avoided and the child is encouraged to create their own phonological plan and produce the word using the visual symbols when necessary. An alternative treatment, the Auditory-Motor Integration (AMI) Treatment is quite different with a heavy emphasis on prior auditory stimulation and self-judgments of the match between auditory inputs and outputs. A third condition is a usual care CONtrol condition focusing on high intensity practice. In all cases we teach nonsense words paired with real objects, with the words structured to target the children’s phonological needs in the segmental and prosodic domains.

The results are assessed by applying a resampling test to probe scores and then combining p-values across the children. These are the statistical results (F and t tests by resampling test) for the Same Day Probe Scores, with p-values combined across the 5 children who have proven to have phonological planning problems in concert with a severe inconsistent speech disorder:

[Table: TASC PMP results, Aug 2015]

The results in the third column show that all of the children obtained a significant treatment effect. The findings in the remaining columns pertain to planned comparisons with positive t values being in the expected direction. The combined p-values indicate that all treatments are significantly different from each other and inspection of the mean scores across children shows that the pattern of results is PMP > CON > AMI. The result is made more interesting by the fact that the pattern of results is the exact opposite for children with a motor planning disorder. Tanya Matthews and I will compare these two subgroups with data and video during our presentation at ASHA 2015 in Denver this coming fall.
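For readers unfamiliar with this kind of analysis, here is a minimal sketch of the general logic, under stated assumptions. It is not our analysis script. The test statistic is a simple difference in mean probe scores rather than the F and t statistics reported above, the p-values are combined with Fisher's method (I am not claiming here that this is the combination rule used for the table), and the probe scores are invented purely so that the code runs.

```python
import math
import random
from statistics import mean

def randomization_p(scores, labels, cond_a, cond_b, n_resamples=5000, seed=0):
    """One-sided p-value for mean(cond_a) > mean(cond_b) by reshuffling labels."""
    rng = random.Random(seed)

    def diff(current_labels):
        a = [s for s, l in zip(scores, current_labels) if l == cond_a]
        b = [s for s, l in zip(scores, current_labels) if l == cond_b]
        return mean(a) - mean(b)

    observed = diff(labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(shuffled)            # reassign treatments to sessions
        if diff(shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_resamples + 1)

def fisher_combined_p(p_values):
    """Fisher's method: -2 * sum(ln p) is chi-square with 2k degrees of freedom."""
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)   # the statistic divided by 2
    # Closed-form chi-square tail probability for an even number of df.
    terms = sum(half_stat ** i / math.factorial(i) for i in range(k))
    return math.exp(-half_stat) * terms

# Invented probe scores for one hypothetical child (10 sessions).
scores = [8, 9, 9, 10, 8, 3, 2, 4, 3, 2]
labels = ["PMP"] * 5 + ["AMI"] * 5
p_child = randomization_p(scores, labels, "PMP", "AMI")
print(p_child, fisher_combined_p([p_child, 0.04, 0.10]))
```

The design choice worth noticing is that the reference distribution comes from reshuffling the treatment labels over the child's own sessions, which is what the random assignment of treatments to sessions licenses.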

Session Number: 1429
Session Title: Differential Diagnosis of Severe Phonological Disorder & Childhood Apraxia of Speech
Day: Friday, November 13, 2015
Time: 1:00 PM – 3:00 PM
Session Format: Seminar 2-hours

For now, the take-away message is that learning new words involves (at least) three important processes: encoding the sound of the new word, memory processes for storing and retrieving the phonological representation, and motor planning processes for planning and programming articulatory movements prior to production of the new word. There are published studies showing that intervention procedures targeting each of these processes help children with speech, language and reading difficulties. Increasing frequency of high quality input improves quality of the acoustic-phonetic representation. Pairing phonological segments with visual symbols helps with storage and retrieval of the phonological representation. High intensity speech practice with appropriate stimulation and feedback improves motor planning and motor programming. The trick is to figure out which children require which procedures at which time.


I greatly enjoyed this new Frontiers in Neuroscience paper by Hickok and colleagues called “Partially overlapping sensorimotor networks underlie speech praxis and verbal short-term memory: Evidence from apraxia of speech following acute stroke”. These researchers evaluated 76 patients during the acute phase of their stroke using behavioral and MRI measures. They found a strong relationship between apraxia (AOS) and verbal short-term memory (vSTM) difficulties as well as weak relationships between aphasia and AOS and vSTM upon behavioral testing. For patients with AOS, the MRIs revealed tissue damage along a sensorimotor network of motor-related areas and sensory-related areas. The motor-related areas that were implicated were as follows: primary motor cortex (proposed site of motor programs for opening and closing vocal tract gestures that correspond roughly to consonant and vowel phonemes); pars opercularis (a part of Broca’s area involved in phonological processing and suppression of response tendencies); premotor cortex (planning and sequencing of speech units and sensory guidance of movement; motor programs for syllables); and insula (specialized for motor planning of speech). The sensory-related areas associated with AOS were primary somatosensory cortex (site of somatosensory targets for speech); secondary somatosensory cortex (sensorimotor integration); parietal operculum (sensory motor interface for speech); and auditory cortex (processing of auditory information; auditory targets for speech). The areas associated with vSTM deficits overlapped those associated with AOS but only in the motor-related areas, specifically pars opercularis and pars triangularis (i.e., Broca’s area), premotor cortex and primary motor cortex.

With regard to the network associated with AOS, the authors concluded that the findings demonstrate “that the targets for speech are sensory in nature” and that “motor control generally and speech motor control specifically is dependent on sensorimotor integration”. I found these conclusions to be interesting in view of our intervention studies with children who have childhood apraxia of speech. As I reported in a previous blog, we are having success with an approach in which we encourage strengthening of both articulatory-phonetic and acoustic-phonetic representations for target words and the connections between them.

With regard to vSTM, the authors indicate that “the involvement of motor areas is predicted as vSTM involves an articulatory rehearsal component”. They seem surprised, however, that “posterior, sensory related regions” were not implicated in this study as correlates of the hypothesized “storage” component in short-term memory. This finding reminded me of a paper I wrote in 2008 in which I pointed out that children’s nonword repetition performance, supposedly a measure of vSTM, factors with speech production accuracy rather than language ability in large scale studies involving children with either typical or atypical language development. I interpreted these findings in relation to a connectionist model of working memory proposed by MacDonald and Christiansen (2002). According to this model there is no short-term memory store per se because working memory is not differentiated from linguistic knowledge and processing. Individual differences in working memory task performance reflect differences in precision of phonological representations and processing efficiency due to experiential and biological factors. The processes and representations involved in working memory are the same as those used in speech planning. Many of the children that we are working with have difficulty planning an utterance – I have described these children with phonological planning difficulties in a previous blog. The children have difficulty with consistent repetition of nonwords and complex real words. The successful intervention for these children involves providing multimodal external cues to support the child’s efforts to construct and execute a plan to produce new words, as described in a previous blog. It is important, however, that the SLP avoid providing an auditory model for imitation by the child, although the SLP may imitate the child’s production to reinforce successful attempts or correct failed attempts.

Hickok et al. interpret their findings in light of their hierarchical model, although I remain uncertain about this notion of a hierarchical organization of these components just because I can never quite sort out what “higher” versus “lower” means when placing these kinds of components in a hierarchical relationship. However, the importance of acquiring knowledge of different forms of linguistic representation – acoustic, articulatory, phonological and semantic – and linking across multiple representations to achieve functional goals has implications for typical and atypical language development.

Tanya and I will be discussing these issues further (with video demonstrations) at ASHA 2014:

Topic Area: Speech Sound Disorders in Children
Session Number: 1037
Title: Differential Diagnosis of Severe Phonological Disorder & Childhood Apraxia of Speech
Session Format: Seminar 2-hours
Day: Thursday, November 20, 2014
Time: 10:30 AM – 12:30 PM
Author(s): Susan Rvachew and Tanya Matthews

Dose Frequency for Effective Speech Therapy

I am writing to address a specific question that has come up: in order to be effective when treating an “articulation disorder” how many trials should the SLP elicit from the client per treatment session? This is an important question and it is surprising that so little research attention has been directed at uncovering the answer. This is a question about what Warren, Fey and Yoder (2007) refer to as “dose: number of properly implemented teaching episodes per session”. We could be talking about the number of presentations of a model or perceptual responses by the child when conducting an “input oriented intervention”. In this blog, however, I will restrict my comments to those interventions that are focused on obtaining speech responses from the child: the teaching episode involves practicing a speech behavior such as a sound, syllable, word or phrase, and each elicitation is counted as a single dose. In speech therapy the question of optimum dose frequency (how many trials per session of a given length) comes up most often in the context of Childhood Apraxia of Speech (CAS) where it is generally believed that practice intensity is particularly important. Recently, Murray, McCabe & Ballard (2014) reported that studies on approaches for CAS typically involved 60 to 120 trials per session whereas studies on approaches for phonological disorders typically involved 10 to 30 trials per session. The closest I have seen to an experimental investigation of dose frequency is the pair of single subject experiments conducted by Edeal and Gildersleeve-Neumann (2011) in which low intensity (30 to 40 trials/session) and high intensity (100+ trials/session) treatment were compared within two children with CAS. They concluded that “Both children showed improvement on all targets; however, the targets with the higher production frequency treatment were acquired faster, evidenced by better in-session performance and greater generalization to untrained probes.”

I don’t see any reason why a higher intensity intervention would not also be a “good thing” when treating children with a phonological disorder and indeed this is what Williams (2012) concluded when she reviewed data from her lab. After a quantitative summary of treatment outcomes for 22 children who received her multiple oppositions intervention she recommended a minimum dose of 50 trials over 30 sessions with anything less being ineffective and higher doses (70 trials or more) being necessary for the most severely impaired children. In this case the children received 30 minute sessions twice per week.
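
To make the arithmetic behind these recommendations concrete, here is a minimal sketch. It assumes that Williams’ recommendation can be read as a number of trials per session repeated over 30 sessions, and that total practice is simply trials per session multiplied by number of sessions; that is my simplification for illustration, and the function name is hypothetical.

```python
def total_trials(trials_per_session, sessions):
    """Total practice trials across a block of treatment sessions."""
    return trials_per_session * sessions

# Reading of Williams (2012): at least 50 trials per session over 30 sessions
minimum_total = total_trials(50, 30)

# Higher dose suggested for the most severely impaired children
severe_total = total_trials(70, 30)

# With sessions held twice per week, 30 sessions span 15 weeks
weeks = 30 // 2

print(minimum_total, severe_total, weeks)
```

On this reading, the gap between the minimum dose and the higher dose amounts to several hundred additional practice trials over the course of treatment, which is one way to think about why a modest change in trials per session might matter.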

Recently we have been conducting single subject experiments with children who have CAS and although treatment intensity is not the primary focus of attention in these studies my doctoral student, Tanya Matthews, and I have been looking at the relationship between dose frequency and outcomes. In the figures shown below the children’s “next day probe scores” (an indicator of maintenance of learning over a short-term period, expressed as proportion correct) are shown as a function of the number of trials completed (top chart) as well as the number of correct trials in each session (bottom chart). There is not much variability in the number of trials per session because we put a lot of pressure on the student SLPs to keep this number high. However the number of correct trials varies quite a bit depending upon the severity of the child’s speech delay and whether it is early or late in the child’s treatment program. The lower chart shows that next day probe scores are better if the number of correct trials in each 20 minute practice session is above 60. The number of correct trials never goes above 80 because we are working to keep the child “at challenge point” so if the child begins to produce more than 80% correct trials we make the task more difficult. However, if the child is producing many errors it does not really help to keep the response rate high either because the child is just practicing the wrong response anyway.
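
The decision rule described above can be sketched in a few lines. This is only an illustration, not the protocol used in our study: the function name and the returned labels are hypothetical, the 80% ceiling and the 60 correct trials come from the paragraph above, and no lower accuracy threshold is encoded because none is specified there.

```python
def summarize_session(trials, correct, ceiling=0.80, target_correct=60):
    """Summarize one practice session against the thresholds discussed above.

    ceiling: accuracy above which the task is made more difficult.
    target_correct: number of correct trials associated with better
    next day probe scores in the lower chart.
    """
    accuracy = correct / trials
    return {
        "accuracy": accuracy,
        "raise_difficulty": accuracy > ceiling,
        "met_correct_target": correct > target_correct,
    }

# Hypothetical session: 100 trials, 70 of them correct
print(summarize_session(100, 70))
```

In this hypothetical session the child is below the 80% ceiling, so the task stays at the current level, and the count of correct trials is above 60.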

So to sum up, notwithstanding the rather poor quality and quantity of the data, my impression is that dose counts: regardless of whether the child has a motor speech disorder or a phonological disorder it is important to achieve as many practice trials as you can in a treatment session but it is also a good idea to ensure that the child is achieving accuracy at the highest possible level of complexity and variability during practice as well.

Number of trials by probe score

Phonological Memory and Phonological Planning

I have been writing about the children in our intervention study for children with Childhood Apraxia of Speech (CAS). So far about half of the children referred to us appear to have difficulties in the domain of phonological memory with their overt phenotype corresponding to the subtype described by Barbara Dodd as Inconsistent Deviant Disorder. Shriberg et al. (2012) have developed the Syllable Repetition Task (SRT) as one means of identifying deficits in “memory processes that store and retrieve [phonemic, sublexical, and lexical] representations.” We have been using the SRT to differentiate children who have deficits in phonological planning versus motor planning. I described the profile that corresponds to difficulties with motor planning (transcoding) in a previous post. Today I will discuss the phonological memory or phonological planning profile that we see in approximately half of the children that are referred to us with suspected CAS.

These children can be identified by a qualitative analysis of their SRT performance and by their performance on the Inconsistency Test of the DEAP. Starting with the SRT, one child in our study for example was able to achieve 12/18 consonants correct when imitating 2-syllable items but only 5/18 consonants correct when imitating 3-syllable items, thus exemplifying the classic profile of a child with phonological memory difficulties – better nonword repetition performance for short versus long items. Qualitatively he tended toward consonant harmony errors even with some 2-syllable items, /bama/=[mama],  /maba/=[mama],  and then more frequently with the 3-syllable items, /nabada/=[mamada]. Addition of syllables and vowel errors also occurred, /manaba/ = [mamadada],  /manabada/=[mimadama]. Poor maintenance of phonotactic structure and vowel errors were also observed on the Inconsistency Test, “helicopter” = [hokopapɚ], “elephant”= [ɛmpɩnt], which yielded an overall inconsistency score of 78% as many words were produced with multiple variants, e.g., “butterfly”= [bʌtfaɩ], [bʌtwaɩ], [bʌtətwaɩ].
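
As a rough illustration of how an inconsistency score of this kind can be tallied, the sketch below counts a word as inconsistent when repeated attempts at it are not all transcribed identically. This is a simplified assumption about the scoring, not the published DEAP procedure. Only the “butterfly” transcriptions come from the child described above; the other entries are placeholders.

```python
def inconsistency_score(productions):
    """Percent of words with more than one distinct production across attempts."""
    variable = sum(1 for forms in productions.values() if len(set(forms)) > 1)
    return 100 * variable / len(productions)

sample = {
    # Transcriptions reported above for this child
    "butterfly": ["bʌtfaɩ", "bʌtwaɩ", "bʌtətwaɩ"],
    # Placeholder entries, hypothetical and for illustration only
    "word_a": ["form1", "form1", "form1"],
    "word_b": ["form2", "form3", "form2"],
    "word_c": ["form4", "form4", "form4"],
}

print(inconsistency_score(sample))
```

Two of the four sample words vary across attempts, so the sketch reports 50% for this made-up set.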

The most striking illustration of the difficulties these children have with the storage and retrieval of phonological representations comes during our treatment sessions however. In this research program we are teaching the children nonsense words in meaningful contexts. For example in one scenario we teach the children the names of “alien flowers” and in one of the treatment conditions we use graphic stimuli, paired with gestural cues if necessary, to represent the syllables and phonemes in the words and phrases that we are teaching. Many of the children in our study learn all of the nonsense words without difficulty (5 words per goal/condition introduced over six 45-minute sessions). However children with the phonological memory difficulties have great difficulty learning the words (SLP: This is a speet. Say speet. Child: speet. SLP: That’s right, speet. What is it? Child: I don’t know. SLP: Yes, you do, it’s speet, the purple one, the purple one is speet, remember, say speet. Child: speet. SLP: You’ve got it, the purple flower is speet, it’s a speet, what is it, it’s a … Child: um, I don’t know, and so on).


The most effective intervention to use with these children closely mirrors the procedures described by Barbara Dodd as the “core vocabulary” approach and demonstrated by Sharon Crosbie in the video that accompanies their chapter in the Williams, McLeod and McCauley (2010) book. The video is lovely and shows how to use graphic stimuli and a chaining procedure to teach the child to produce a word consistently – the idea is to encourage the child to develop and implement their own phonological/motor plan rather than relying on an imitative model. The children respond to this technique really well and will learn to say the new words such as “speet” and “stoon” quickly and accurately. The trouble begins when our student SLPs want the children to use the new words spontaneously in phrases (e.g., “water the speet”). They have great difficulty remembering the word or even the carrier phrase without the imitative model and I have to work really hard to teach the student clinicians to withhold the imitative model in favour of using other cues to stimulate spontaneous production of the target words and phrases (SLP: What is it? Let’s start with the snake sound here…).

We have wonderful video of student SLPs learning these techniques as well as children achieving their goals. Tanya Matthews and I will be presenting them at ASHA 2014. The difference in the way that you implement therapy with these children is subtle but important. I am pretty sure that Case Study 8-4 in our book had a phonological planning deficit rather than the motor planning disorder that he was treated for. I can’t help but think that if he was treated with these techniques he might have made some progress in the three years that we followed his case (whereas he made literally no progress at all until he was treated with a synthetic phonics approach in second grade). I’d love to hear from you if you have any other ideas about how best to treat children with phonological memory problems and inconsistent deviant disorder.




Auditory Motor Integration Intervention for CAS

In March 2013 I described the research we are conducting in my lab to identify individual differences in response to two different approaches to the treatment of Childhood Apraxia of Speech. I also described the unique single subject randomization design that we are using and presented some data for one child without revealing the interventions that corresponded to the condition that worked best for this particular child. We have subsequently replicated this result with another child so today I am going to write about the features of the intervention that children with difficulties in the area of transcoding appear to benefit from most clearly. Recall that transcoding is revealed in part by addition errors on the Syllable Repetition Task. In the case of the child profiled in the previous blog, he added nasal consonants at syllable boundaries when asked to repeat the syllable strings and he was just as likely to do this for short strings as for long, e.g., “mada” → [bᴂndə] and “manabada” → [mandabad]. This child also had difficulty with multisyllable repetition during the maximum performance tests but no difficulty with the single syllable diadochokinetic rate. Within word inconsistency was borderline with inconsistent word productions largely reflecting single feature errors (voicing errors for example). Altogether the impression is of a true apraxia or motor planning disorder (as opposed to a phonological planning deficit, a more common problem that I will describe in a future post). Thus far we have assessed 18 children in this study and remarkably only 3 have presented with this particular profile.

Two of these children have shown the best response to an intervention that is directed at promoting auditory-motor integration. It includes input-oriented procedures that are described in Chapter 9 of my book combined with output-oriented procedures described in Chapter 10. The procedures are used to promote the consistent use of stimulable phonemes in the context of word shapes that are difficult for the child so that the focus is more on holistic movement patterns at the whole word level than on individual phonemes. In the case described here we taught novel “monster names” that had a strong-weak-strong stress pattern and word internal coda consonants such as “Biftenope” and “Hapnidreem” and assessed for carry-over to phrases with similar structures (pumpkin pie, bat mobile). 

One reason that we designed an intervention approach that focused on auditory-motor integration is that there is evidence from the animal literature suggesting that this might be a foundational problem in the case of apraxia. Kurt, Fisher and Ehret examined sensory-motor association learning in mice with two different FoxP2 mutations. The task involved learning to avoid electric shock by leaping a hurdle (or not) to the other compartment of a box in response to varied tones that signaled the location of the shock. Mice with either mutation were impaired in their response, one more severely than the other, in comparison to wild-type mice that learned the task without difficulty. The second reason that we designed an intervention with an auditory-motor integration component is that the ability to modify motor plans in response to auditory feedback and in relation to an auditory target is theoretically essential to the acquisition of speech motor control.

So what does an intervention that focuses on auditory-motor integration look like? Not surprisingly it has procedures that focus attention on the auditory-perceptual aspects of speech as well as procedures that focus on motor practice, none of the procedures themselves being novel or surprising. During the prepractice portion of each treatment we ensured that the child had a good perceptual representation for the target words using auditory bombardment and focused stimulation in meaningful contexts as well as error detection tasks as described in my teaching blog (scroll down to week 22). We also taught the child to monitor his own speech and respond differentially to his own correct or incorrect productions of the target words. For example an appropriate activity might be for the child to “call” the monster and to then place the monster in his sleeping bag in the tent if he heard himself produce the name correctly or to place the monster in an alternative sleeping bag out in the rain if he heard himself produce the name incorrectly (our students are endlessly creative and this variation on the game has proved to be popular with the children this year). The practice part of the session, for the most part, proceeds as one would expect for any child with CAS, focusing on high intensity practice while the SLP provides just enough stimulation prior to each attempt to elicit a correct response more often than not. However, every effort is made to avoid providing too much feedback. Working in blocks of five trials each, summative knowledge of results is provided whenever possible – this means that the child is given an opportunity to evaluate his own responses in relation to his own auditory goal without interference from SLP input, and then compare his own judgment with the SLP’s count of correct responses at the end of each 5 trial run. Edy Strand writes about the importance of giving the child time to integrate feedback in her chapter with Debertine in Caruso and Strand (1999) and describes precisely how to do this. Given a high rate of responses (over 100 trials per 20 minute practice session) and an average of 70% correct responses, this child was able to make excellent progress as measured by both same day and next day probes (see green bars on his chart here). A second child with the same profile also showed a significant benefit in favour of this approach. A third child is still being treated and it will be some time before we will know if he completes the protocol and then many more months before blind coding of his results will be finished. But, we are hopeful!
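
For readers who like to see the bookkeeping, here is a minimal sketch of summative knowledge of results delivered in blocks of five trials. It assumes the clinician simply records each attempt as correct or incorrect and reports the count only at the end of each block; the function name and the sample data are hypothetical.

```python
def block_summaries(outcomes, block_size=5):
    """Return (number correct, number of trials) for each consecutive block.

    Feedback would be given once per block, not after every trial.
    """
    summaries = []
    for start in range(0, len(outcomes), block_size):
        block = outcomes[start:start + block_size]
        summaries.append((sum(block), len(block)))
    return summaries

# Hypothetical run of ten attempts: True = correct, False = incorrect
attempts = [True, False, True, True, True, False, True, True, False, True]

for correct, total in block_summaries(attempts):
    print(f"{correct} of {total} correct")
```

The point of the delay is that the child has a chance to judge each attempt for himself before hearing the count for the whole block.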

Online Gaming and Speech Therapy

I have just read this marvelous paper tweeted out by @vaughanbell: Stafford, T., & Dewar, M. (2013). Tracing the Trajectory of Skill Learning With a Very Large Sample of Online Game Players. Psychological Science. He was impressed by the very large sample size (N = 854,064) but I am impressed by the relevance of this paper for speech therapy. The researchers used “detailed records of practice activity from an on-line game” to test hypotheses about learning in the game, which requires “rapid perceptual decision making and motor responses”. Gratifyingly for us as speech-language pathologists, the results confirm the principles of motor learning that are currently promoted for successful treatment of childhood apraxia of speech (CAS), specifically practice intensity, distributed practice and variable practice conditions (for application of these principles to the treatment of apraxia of speech see for example Gildersleeve-Neumann in the ASHA Leader or Tricia McCabe’s ReST program).

There was one concept raised in the paper that was a little bit novel with respect to the CAS literature however: specifically, the authors talk about the “exploration/exploitation” dilemma. In the context of this simple but bizarrely fun computer game (found here at the Wellcome Collection) you can explore the axon growing environment when first learning to play or you can settle into a strategy of simply clicking on the closest protein in your circle of influence. The latter strategy will work to grow your axon, which is the object of the game, but you will miss out on learning how to maneuver your circle of influence so as to actively find the “power proteins” that advance the growth of your axon. Exploration has a cost in that it leads to more variable performance early on but the benefit is potentially better performance with longer experience. In fact, Stafford and Dewar observed a close relationship between higher early variance in performance and better performance during later attempts. This trade-off between exploration and exploitation reminded me of the importance of the expansion stage in early speech development and the implications for intervention with young children with CAS.

In Table 10-1 of Developmental Phonological Disorders: Foundations of Clinical Practice we suggest learning outcomes and therapeutic strategies to correspond to four stages of speech development as follows: 1. Expansion stage (explore possibilities of the vocal system); 2. Babbling and integrative stage (controlled variability); 3. Early speech development (expanding repertoire of phones and word shapes to achieve intelligible speech); and 4. Late speech development (ongoing refinements to achieve adultlike speech accuracy and precision). These stages are described in greater detail in Chapter 3 which covers the literature on the development of speech motor control. The expansion stage typically occurs during months 3 through 6 and is characterized by a variety of vocalizations that are not very speech-like (squeals, growls, raspberries and so on) as well as the appearance of fully resonant vowels and marginal babble. It is my experience that SLPs do not appreciate the importance of the expansion stage to normal speech development or understand its significance when planning an intervention program for children with limited if any speech capacity. Therefore I highlight this point in Chapter 10, as follows:

“The importance of the expansion stage in the laying of building blocks for later speech development is easy to forget when choosing goals for speech therapy, a topic to which we return shortly. Another important achievement during the infant period is the acquisition of canonical syllables when the child learns to control the variable parameters explored during the expansion stage, coordinating them to produce well-formed syllables in the context of babble, jargon, and early words. …Typical descriptions of speech acquisition focus on reductions in variability with age. … Therefore, it is not surprising that traditional speech therapy procedures are designed to enhance consistency and reduce variability in the production of phonemes with practice. However, variability is not always an impediment to speech learning and children with DPD often suffer from insufficient variability in their repertoire of speech behaviors. Performance variability can be viewed as facilitating, detrimental, or irrelevant to a successful outcome depending on the motor learning context (Vereijken, 2010). For example, the highly variable vocalizations of the expansion stage provide a complex foundation for the emergence of speechlike vocalizations at later stages. Infants who are described as being “quiet” during the first year of life lack sufficient variability for normal motor speech development. The normally developing infant harnesses rather than reduces this variability to coordinate the separate respiratory, phonatory, resonance, and articulatory components to produce babble in the next stage. Throughout the next 16 or so years there will be a continual interplay between adaptive variability to meet new challenges and increased stability to enhance precision. (p. 758)”

I often talk to SLPs who are frustrated by failed efforts to teach new phones via imitation to children with severe speech sound disorders. However children with limited vocal repertoires must first be encouraged to freely explore their vocal systems. I describe procedures to encourage vocal play in detail in the book, following DeThorne, Johnson, Walder, and Mahurin-Smith (2009) and supplementing with examples of implementation from my own clinical experience. I hope that Stafford and Dewar’s interesting research and this amusing little game lead to more reflection about the role of exploration and variability in speech motor learning.

Single Subject Randomization Design for CAS Intervention Research

I have recently returned from the very excellent Childhood Apraxia of Speech Symposium sponsored by the Childhood Apraxia of Speech Association of North America and held in Atlanta last month. The scientific presentations were wonderful and I hope to have posts related to many of them over the next few months. I begin by highlighting Larry Shriberg’s presentation as it relates to my current CASANA funded intervention study and I am, with some excitement, analyzing the data from the first cohort of participants this week since it is our winter break from teaching.

Dr. Shriberg presented data recently published in Clinical Linguistics and Phonetics (Shriberg, Lohmeier, Strand & Jakielski, 2012). In this paper the authors describe the use of the Syllable Repetition Task (SRT) for the identification of CAS. The paper, the test, and all the information you need for scoring and interpreting the test data is available for download at The Phonology Project website. The SRT consists of 18 items comprised of two to four syllables made up of the consonants /m, n, b, d/ and the vowel /ɑ/ and thus it is designed explicitly for children with speech delay. The task was administered to 4 quite large samples of children: Group 1, Typical Speech, Typical language; Group 2, Speech Delay, Typical Language; Group 3, Speech Delay, Language Impairment; and Group 4, CAS with this last group subdivided into idiopathic and neurogenetic etiological subtypes for some analyses. The test results were presented in the form of four scores: Competence, total percentage of correctly repeated consonants overall; Encoding Processes, percentage of within-class manner substitutions; Memorial Processes, ratio of sounds correct in 3-syllable-versus-2-syllable items; Transcoding processes, percentage of items containing one or more addition errors, subtracted from 100 for directional clarity. Most interestingly, the latter three scores were not correlated with each other within any of the groups although they were all moderately correlated with the competence score. The CAS group showed worse performance than the other three groups on all of these measures although their performance on the Transcoding processes measure was most distinctive. The diagnostic usefulness of the Transcoding score is much enhanced by also considering aspects of the children’s prosody in connected speech (inappropriate pauses, slow rate, lexical or phrasal stress errors). 
In conclusion, these findings were taken as evidence that CAS is a multiple domain disorder with low encoding scores reflecting incomplete or poorly formed phonological representations, low memorial scores reflecting difficulties with phonological memory, and low transcoding scores reflecting a motor planning/programming deficit. Given that the paper presents group data, and that the encoding, memorial and transcoding scores are not correlated with each other, it is not clear however that all children with CAS will show difficulties in all of these areas. It seems possible if not likely that there will be considerable heterogeneity within this population with different children showing variant profiles across these three speech processes. The purpose of our study is to consider this heterogeneity by examining response to three interventions in individual subjects.
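
To show how three of these scores relate to item-level data, here is a minimal sketch based only on the verbal definitions given above. It is not the published scoring procedure from the Phonology Project materials: the data layout, field names and sample items are hypothetical, the Memorial score is left as a raw ratio, and the Encoding score is omitted because it requires classifying each substitution by manner class.

```python
def srt_scores(items):
    """Compute simplified Competence, Memorial and Transcoding scores.

    Each item is a dict with:
      syllables - number of syllables in the item
      target    - number of consonants in the item
      correct   - number of consonants repeated correctly
      addition  - True if the response contained one or more added sounds
    """
    total_target = sum(i["target"] for i in items)
    total_correct = sum(i["correct"] for i in items)
    competence = 100 * total_correct / total_target

    def proportion_correct(n_syllables):
        subset = [i for i in items if i["syllables"] == n_syllables]
        return sum(i["correct"] for i in subset) / sum(i["target"] for i in subset)

    # Ratio of accuracy on 3-syllable items to accuracy on 2-syllable items
    memorial = proportion_correct(3) / proportion_correct(2)

    # Percentage of items with additions, subtracted from 100
    with_additions = sum(1 for i in items if i["addition"])
    transcoding = 100 - 100 * with_additions / len(items)

    return competence, memorial, transcoding

# Hypothetical responses to five items
responses = [
    {"syllables": 2, "target": 2, "correct": 2, "addition": False},
    {"syllables": 2, "target": 2, "correct": 1, "addition": False},
    {"syllables": 3, "target": 3, "correct": 2, "addition": True},
    {"syllables": 3, "target": 3, "correct": 1, "addition": False},
    {"syllables": 4, "target": 4, "correct": 2, "addition": True},
]

competence, memorial, transcoding = srt_scores(responses)
print(round(competence, 1), round(memorial, 2), round(transcoding, 1))
```

Because the three process scores draw on different aspects of the same responses, a child can look quite different on each of them, which is the point made above about heterogeneity.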

In a previous post I mentioned an alternative to traditional single subject designs that does not require a stable baseline while allowing for statistical analysis. We are using one form of this design in this study, the single subject randomization design, more specifically set up as a randomized block experiment as described in my paper on the application of these designs to communication disorders research (Rvachew, 1988). We have six children participating in the study this winter and 3 more enrolled for the spring. I provide partial data for one child in this post simply as a way of demonstrating the usefulness of this design for research with low incidence disorders. The child is school age with borderline verbal and nonverbal IQ, speech delay, and ADHD. Apraxia of speech was confirmed by administration of the Kaufman Speech Praxis Test and maximum performance tasks revealing normal single syllable repetition rates but an inability to sequence three syllables consistently and at a normal rate. The results of the Syllable Repetition Task indicated an extremely low competence score despite encoding and memorial processing within the average range for his age. He did have difficulties with transcoding however as indicated by the characteristic addition of nasal consonants.

Three speech targets were selected for this boy: word internal codas, word-initial /l/ clusters, and word initial velar stops (with baseline performance in single word naming being 50, 29, and 33 percent correct respectively). All targets were addressed via pseudowords linked to nonsense referents in a functional context. All targets received 20 minutes of concentrated practice per week using the integral stimulation hierarchy as described by Christine Gildersleeve-Neumann. However, the prepractice condition (which was implemented for 20 minutes prior to the practice session) varied for each target. The three prepractice conditions being compared in this study were randomly assigned to the targets with the following result: word internal codas were treated using input oriented prepractice procedures, word-initial /l/ clusters were associated with sham prepractice procedures (control condition) and velar stops were treated with output oriented prepractice conditions. The input oriented prepractice conditions included auditory bombardment and error detection tasks as described by Rvachew and Brosseau-Lapré (see also Chapter 9 of our book, http://www.pluralpublishing.com/publication_dpd.htm). The output oriented procedures are described by Dodd and colleagues for improving the child’s ability to independently build a phonological plan for the word by linking syllables and phonemes to graphical cues and then chaining the subword units. Phonetic placement was also incorporated into this condition as needed.

Raw Session and Next Day Probe Scores for One Child By Treatment Condition

In keeping with the randomized block design, the child received three treatment sessions per week, with each treatment condition/treatment target pair assigned at random to one of the three days on a week by week basis. Two outcome measures were recorded: the child’s responses to imitative phrase probes that were administered at the end of the session to assess learning during a given intervention session, and the child’s responses to imitative phrase probes that were administered at the beginning of the next session to assess maintenance of learning. The child’s performance on these probes is shown on the figure below: pastel bars are the session probes indexing session performance and solid bars are the next day probes indexing maintenance of learning to the next session. Different colours represent different prepractice conditions. These probe scores were submitted to a nonparametric randomization test as described in Rvachew (1988) with the results indicating that there was no difference in probe performance at the end of each session as a function of prepractice condition, F(2,5) = 1.19, p = .392. However, there is a significant effect of prepractice condition when considering next day probe performance, F(2,5) = 23.01, p = .002. Now, I am going to make you crazy by not revealing which prepractice condition is associated with each colour! The reason is that this is just one child and I want to see the results for the other children – I have observed the responses of the other children and have reason to believe that in fact there are differences in actual learning as a function of prepractice condition but we will feel more confident after having blinded transcriptions of probe data from more children.
It should be obvious with this design that there are many other variables that can influence the outcome such as intrinsic differences in the difficulty of the targets, differences associated with the days of the week, and differences in clinician (although some of the same people were in the room during every session, the treating clinician was not the same during every session). Therefore we need to replicate the result many times before we feel confident interpreting these results. However, I wanted to introduce readers to the SRT, the notion of CAS as a multiple domain disorder, and the single subject randomization design as a way of looking at the relationship between response to intervention and underlying psycholinguistic profile. I hope that you will stay tuned – we hope to take data from the first six children to ASHA13.
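
For readers who are curious about how a randomization test works in a design like this, here is a minimal sketch. It is not the analysis reported above and makes several assumptions: each week is treated as a block with one probe score per condition, the scores are invented for illustration, and the test statistic is the between-condition sum of squares in place of F. Shuffling scores within a week leaves the week totals and the overall total unchanged, so this statistic ranks the possible arrangements in the same order as a randomized block F would.

```python
from itertools import permutations, product

def condition_ss(blocks):
    """Between-condition sum of squares for scores laid out as blocks[week][condition]."""
    n_blocks = len(blocks)
    n_conditions = len(blocks[0])
    grand_mean = sum(sum(block) for block in blocks) / (n_blocks * n_conditions)
    condition_means = [
        sum(block[c] for block in blocks) / n_blocks for c in range(n_conditions)
    ]
    return n_blocks * sum((m - grand_mean) ** 2 for m in condition_means)

def randomization_p(blocks, tolerance=1e-12):
    """Proportion of within-week rearrangements at least as extreme as the observed data."""
    observed = condition_ss(blocks)
    reorderings = [list(permutations(block)) for block in blocks]
    as_extreme = 0
    total = 0
    # Enumerate every way the scores could have been assigned to conditions
    # if condition made no difference within each week
    for arrangement in product(*reorderings):
        total += 1
        if condition_ss(arrangement) >= observed - tolerance:
            as_extreme += 1
    return as_extreme / total

# Hypothetical next day probe scores (proportion correct), one row per week,
# columns are three prepractice conditions
weekly_scores = [
    (0.2, 0.3, 0.7),
    (0.3, 0.2, 0.8),
    (0.1, 0.3, 0.6),
    (0.3, 0.4, 0.9),
]

p_value = randomization_p(weekly_scores)
print(round(p_value, 4))
```

With four weeks and three conditions there are only 1,296 possible arrangements, so they can all be enumerated; with more weeks one would sample random arrangements instead. The appeal for low incidence disorders is that the p value comes from the random assignment itself and not from assumptions about a stable baseline or a large sample.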

MMN and Speech Therapy

I haven’t had much time to blog these past four months because I have been starting a new treatment study, this time on Childhood Apraxia of Speech. Quite a few papers on this topic have caught my eye and now that our project is ready for lift-off I am going to comment on one paper that is particularly relevant to the interventions that we will be comparing. The paper by Froud and Khamis-Dakwar (2012) concludes that “there is some phonological involvement in CAS and that CAS cannot be characterized as a purely motoric disorder” and claims to be “the first investigation to utilize neurophysiological methodologies to examine the neural underpinnings of primary CAS” (p. 310). We will be comparing motor speech practice alone to approaches that combine motor speech practice with prepractice procedures designed to strengthen children’s underlying phonological representations at the acoustic-phonetic or articulatory-phonetic levels and thus the paper was of interest to me. I was surprised to find a mistake in the authors’ representation of prior work on the development of MMN responses to phonological categories in their introduction however. They begin by explaining that the mismatch negativity (MMN) response is an “automatic, preattentive [evoked potential] response to stimulus change that can be elicited in the absence of conscious attention”. This research technique is used in important infant perception research and therefore Francoise and I explain the technique carefully and present some seminal research findings in detail in our book Developmental Phonological Disorders: Foundations of Clinical Practice. Froud and Khamis-Dakwar mention one of these seminal studies, Näätänen, Lehtokoski, Lennes et al. (1997), correctly reporting that adult Finnish-speaking participants in this study showed larger MMN responses to native language vowel contrasts than to stimulus changes that did not cross a phoneme boundary in their native language (one of the contrasts in question was phonemic in Estonian but not in Finnish). Then Froud and Khamis-Dakwar strangely report that Estonian research participants did not show MMN responses to the same stimuli, which makes so little sense in relation to the actual study findings that there must be a typo involved. Whatever the source of the error, the whole flavour of the paper reinforces rather than dispels common misperceptions about early phonological development in general and MMN research in particular.

A common misreading of the literature is the notion that responses to non-native phoneme contrasts are lost in infancy – this is simply not true and the MMN research program led by Näätänen and colleagues illustrates this perfectly. Brain responses to non-native phoneme contrasts are retained throughout the lifespan, although they reflect the purely acoustic properties of the input and are diffuse, bilateral and weaker than those observed in response to native language phoneme contrasts. During the first few years of life brain responses to language-specific inputs reorganize to become more focal, left-dominant and efficient. The size of the MMN early in life reflects acoustic properties of the stimulus whereas the size of the MMN later in life reflects the linguistic (phonological) properties of the stimulus. We cite other research in our book from labs using other techniques (fMRI, MEG, etc.) that reinforces this point.

Froud and Khamis-Dakwar report that a group of 5 children with CAS (aged 5 to 8 years) showed a larger MMN to a phonetic contrast (VOT 50 ms vs VOT 75 ms) and no MMN to a phonemic contrast (VOT 50 ms vs VOT 5 ms). The age-matched comparison group with normal speech development showed the opposite pattern of results. The data are more-or-less uninterpretable given that the authors provide no information about the children’s language or cognitive skills, and the findings are not consistent with their hypothesis even though the MMN responses for the CAS group differed from those of the comparison group. I expect that the means mask much heterogeneity in responding within their small group of participants. They claim that some aspects of the CAS group’s responses suggest persistence of immature acoustic processing (it is a little hard to see this because the stimuli are not properly controlled for acoustic distance as in the Näätänen et al. studies). They still go on to claim that the findings have implications for theoretical perspectives regarding the etiology of CAS and suggest that the results are compatible with the possibility that a primary phonological deficit is the causal mechanism. This brings me to the second common misperception about research on early speech perception development.

This idea that MMN data can tell us something about the ‘neurophysiological’ underpinnings of CAS is pervasive but perverse. The notion seems to be that these responses are somehow more ‘biological’ than behavioral tests that indicate difficulties with phonological processing and language among children with this disorder. For the life of me I cannot imagine why. MMN research with normally developing children is fascinating because it illustrates beautifully the impact of environmental inputs on the reorganization of brain responses to linguistic stimuli, a reorganization that is coincident with changes in behavioral responses to those same linguistic inputs. If (and this is a really big if given the state of this literature) rather old children with CAS have immature MMN responses that suggest acoustic rather than phonological processing of speech input, what do these responses tell us about the etiology of the speech deficit in CAS? Well, not much really. A large part of any individual’s responses to speech input is formed by that individual’s experience with speech input. Children with CAS are probably not experiencing linguistic input in the same way as other children and thus it is not surprising to find differences in their MMN responses relative to normally developing children. What are the possible interpretations of this finding?

First, as Barbara Lewis has suggested, at least some children with CAS may have a more severe version of a developmental phonological disorder. Although this population is heterogeneous, there is genetic overlap with dyslexia, and ERP studies of newborn infants with dyslexic parents do in fact confirm a primary problem with speech processing in this population. In addition to speech therapy targeting the articulatory component, the child will need therapy to improve phonological representations, and efforts must be made to ensure a rich language environment for the child.

Second, children with CAS tend to not babble and to talk late. The shift from language-general to language-specific processing of speech in infancy is dependent upon the social context in which speech input is provided. It is very likely that children who do not vocalize in the normal way are not receiving the usual linguistic and social inputs during this sensitive period. I remember working with a mum who was depressed because her baby “only growled at her” (it was true – the infant’s only non-cry vocalization was a very low pitched growl, in this case secondary to chronic otitis media in the first six months of life). It took a bit of coaching to teach the mum to perceive and respond to the growls positively but what a difference that made – full babbling emerged shortly after the mum began engaging in reciprocal vocal interactions with her baby. Vihman has hypothesized that the infant’s own vocalizations play a role in the perceptual salience of the input as well. An important aspect of speech perception development is the linking of acoustic to articulatory representations via the dorsal stream, a developmental event that appears to begin with the onset of babbling. Therefore, immature MMN responses to phonemic stimuli may still reflect a primary problem in the motoric rather than perceptual or phonological domains. That is not to say that the intervention must be solely focused on the child’s motor skills. Clearly the intervention must be concerned with the quality of linguistic inputs to the child and the social context in which those inputs are provided to the child even when the core deficit appears to be motoric in nature.

Third, the biological events that produced the core motor deficit may produce additional “core deficits” in other domains, as is probably the case with FOXP2 mutations, which are associated with multiple cognitive and linguistic deficits alongside severe oro-motor and speech dyspraxia. Once again the child will require a broad-based intervention program, targeting multiple levels of representation, although the focus of therapy will shift with the child’s developmental needs over time.

Overall I am happy that this paper focuses attention on the requirement to attend to these children’s perceptual and phonological representations in therapy. I am not sure that I agree that it is actually necessary to understand the underlying etiology to design an effective intervention for the child; rather it is important to fully understand normal speech development and to be able to determine the child’s specific developmental needs as they unfold over time. I definitely don’t agree that MMN studies illuminate the neurophysiological underpinnings of CAS.



Cheour, M., Ceponiene, R., Lehtokoski, A., Luuk, A., Allik, J., Alho, K., & Näätänen, R. (1998). Development of language-specific phoneme representation in the infant brain. Nature Neuroscience, 1, 351-353.

Froud, K., & Khamis-Dakwar, K. (2012). Mismatch negativity responses in children with a diagnosis of childhood apraxia of speech (CAS). American Journal of Speech-Language Pathology, 21, 302-312.

Hickok, G., & Poeppel, D. (2004). Dorsal and ventral streams: a framework for understanding aspects of the functional anatomy of language. Cognition, 92(1-2), 67-99.

Kuhl, P. K., Conboy, B. T., Coffey-Corina, S., Padden, D., Rivera-Gaxiola, M., & Nelson, T. (2008). Phonetic learning as a pathway to language: new data and native language magnet theory expanded (NLM-e). Philosophical Transactions of the Royal Society, 363, 979-1000.

Lewis, B. A., Freebairn, L. A., Hansen, A. J., Iyengar, S. K., & Taylor, H. G. (2004). School-age follow-up of children with childhood apraxia of speech. Language, Speech, and Hearing Services in Schools, 35, 122-140.

Lyytinen, P., Eklund, K., & Lyytinen, H. (2005). Language development and literacy skills in late-talking toddlers with and without familial risk for dyslexia. Annals of Dyslexia, 55(2), 166-192.

Näätänen, R., Lehtokoski, A., Lennes, M., Cheour, M., Huotilainen, M., Iivonen, A., . . . Alho, K. (1997). Language-specific phoneme representations revealed by electric and magnetic brain responses. Nature, 385, 432-434.

Vargha-Khadem, F., Gadian, D. G., Copp, A., & Mishkin, M. (2005). FOXP2 and the neuroanatomy of speech and language. Nature Reviews Neuroscience, 6, 131-138.

Vihman, M. M. (2002). The role of mirror neurons in the ontogeny of speech. In M. Stamenov & V. Gallese (Eds.), Mirror Neurons and the Evolution of Brain and Language (pp. 305-314). Amsterdam, Netherlands: John Benjamins Publishing Company.