Prosody and CAS

I am reading a relatively new paper about Treatment for Establishing Motor Program Organization (TEMPOSM) in childhood apraxia of speech (CAS) by Miller et al. It is a good paper showing that this treatment is effective for reducing segmentation between syllables and improving stress contrast when producing multisyllable nonsense words. There were 11 children and a novel design so the quality of the paper is good. I think the work this group is doing on the treatment of CAS is important.

That said, there is something that is bothering me about the CAS literature (beyond the bizarre adherence to schema theory by EVERYONE, see Speech Therapy and Theories of Speech Motor Control: Part I).

What is bothering me is that we (including me) keep writing that prosody is a core feature of CAS but when you go looking for evidence for that it is really hard to find. Miller et al say that “acoustic measures of the speech of children with CAS show evidence of a disruption in temporal control of speech, marked by increased duration and reduced variability in duration of speech segments.” Seven papers are cited to support disruptions in the duration of speech segments by children with CAS. I have read all those papers many times but I decided to go back to them again. I summarize the evidence here.

In the first paper, Ballard et al describe a prototype of the ReST treatment that targeted syllable stress in 3 children with good effects for at least 2 of the children. But is the problem with stress marking exhibited by these three children a core feature of CAS?

The second paper used a systematic procedure to show a deficit in planning syllables during speech production, recruiting children with apraxia and children with normal speech development. Nijland et al were interested in intra- versus inter-syllabic coarticulation strength and duration structure across the two groups of children. The measures related to coarticulation did not reveal differences in this study. However, differences emerged for the durations of specific segments as a function of prosodic structure of the phrases. That is, in structures like CVs#CVC the duration of the consonants and vowels will differ compared to CV#sCVC because total syllable duration is normally taken into account. The expected metrical adjustments were less systematic in the speech produced by children with CAS. The difficulty of the CAS group is attributed to a problem with ‘phonetic encoding’. This study has the strongest evidence of all the papers cited. It would be interesting to see the result for other children with a speech sound disorder.

The third paper proposes a coefficient of variation ratio as a marker for CAS. This idea is very popular and I have cited this paper fairly often. It is easy to forget what this coefficient is, however. One part of the equation refers to the duration of speech events (parts of the acoustic waveform that are speech) while the other part refers to the duration of pause events (parts of the acoustic waveform that are not speech, that is, pauses). The ratio of the coefficients of variation for these two durations is the CVR. This metric is said to differ in CAS because the CAS group had less variation in the duration of speech events and more variation in the duration of pause events when looking at effect sizes, although the differences among speakers with normal speech, speech delay, and CAS were not statistically significant. So maybe not a difference at all, and if so, a difference in what?
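To make the coefficient concrete, here is a minimal sketch in Python. It assumes the ratio is formed as the coefficient of variation (standard deviation divided by mean) of pause-event durations over the coefficient of variation of speech-event durations. That ordering, the function names, and the duration values are illustrative assumptions and are not taken from the paper.

```python
from statistics import mean, stdev

def coefficient_of_variation(durations):
    """Sample standard deviation divided by the mean (unitless)."""
    return stdev(durations) / mean(durations)

def cv_ratio(pause_durations, speech_durations):
    """Assumed form: CV of pause events over CV of speech events."""
    return (coefficient_of_variation(pause_durations)
            / coefficient_of_variation(speech_durations))

# Made-up durations in milliseconds, for illustration only.
speech_ms = [420, 450, 430, 440, 460]
pause_ms = [120, 300, 80, 250, 150]

print(round(cv_ratio(pause_ms, speech_ms), 2))
```

Under this assumed form, a speaker with very regular speech events and very irregular pauses gets a large ratio, which is the direction described for the CAS group.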

A related paper, also by Shriberg and colleagues, introduces the lexical stress ratio which seems self-explanatory. The abstract to this paper says that the upper and lower extremes of this ratio were associated with CAS (compared to speech delay) and that the results reflect “the prosodic consequences of a praxis deficit in speech motor control.” I defy any reader to find the evidence for that conclusion in the tables and figures. Seriously. Even after you get through the problems with reliable classification of the children and the stress patterns. I should stop citing this paper.

Munson et al also looked at lexical stress, comparing instrumental measures and perceptual judgments. Their conclusions are interesting. Instrumental measures such as vowel duration and fundamental frequency revealed no differences between the speech production of children with CAS versus phonological disorder. There were some differences detected on the basis of perceptual judgment. The authors speculate that the greater number of articulation errors in the speech produced by the CAS group accounts for this finding. This I think is a very important observation. It is difficult to equalize these groups of children on all the variables that may account for listener judgments.

Finally, Velleman and Shriberg conducted a very careful study of lexical stress differences, comparing the production of various metrical patterns by children with CAS or speech delay. Lexical stress errors were coded by ear, specifically identifying syllable omissions and vowel augmentations. The frequency of these errors was roughly similar across the groups in this study although the errors persisted to a greater age in the groups of children with apraxia of speech.

So, my point is that these papers provide weak evidence of a deficit in prosody as a core feature of CAS, that is, very weak to no evidence. People hear something that sounds like a problem with prosody when they listen to these children. Acoustic measures are less likely to reveal a difference than perceptual judgments. Comparison to children with normal speech development is more likely to reveal a difference than comparison to children with speech delay/phonological disorder. It has been suggested that the impression of prosodic disruptions may stem from the greater number of articulation errors in the children’s speech.

I have a question about this issue of prosodic disruptions. Let’s say that there are prosodic disruptions (we know that at least some children have difficulty producing normal sounding lexical stress). How do they arise in the child’s speech? Is the cause the same in every child’s speech? Several of the studies suggest a difficulty with a high-level stage of speech planning and lexical retrieval (see Kircher et al 2004). Here it is important to find the right word and plan the syllables in the right order with the right stress pattern. Young children and even adults produce errors due to the selection of wrong words or wrong syllable templates during phonological planning. Subsequently, planning and execution of the speech gestures to produce the intended syllables should result in the utterance. However, mistiming of these elements – loudness, duration, pitch – may result in anomalies: syllables that are longer than they are supposed to be or syllable breaks that are misplaced or syllables produced with roughly equal pitch. Two children who produce speech with disrupted prosody might do so for the same reason or for different reasons. A clear standard for measuring and comparing prosody in children might help to sort that out.

References in order of appearance:

Miller, H. E., Ballard, K. J., Campbell, J., Smith, M., Plante, A. S., Aytur, S. A., & Robin, D. A. (2021). Improvements in Speech of Children with Apraxia: The Efficacy of Treatment for Establishing Motor Program Organization (TEMPOSM). Developmental Neurorehabilitation, 24(7), 494-509. doi:10.1080/17518423.2021.1916113

Ballard, K. J., Robin, D. A., & McCabe, P. (2010). A treatment for dysprosody in childhood apraxia of speech. Journal of Speech, Language & Hearing Research, 53, 1227-1245.

Nijland, L., Maassen, B., van der Meulen, S., Gabreëls, F., Kraaimaat, F. W., & Schreuder, R. (2003). Planning of syllables in children with developmental apraxia of speech. Clinical Linguistics & Phonetics, 17(1), 1-24. doi:10.1080/0269920021000050662

Shriberg, L. D., Green, J. R., Campbell, T. F., McSweeny, J. L., & Scheer, A. R. (2003). A diagnostic marker for childhood apraxia of speech: the coefficient of variation ratio. Clinical Linguistics & Phonetics, 17, 575–595.

Shriberg, L. D., Campbell, T. F., Karlsson, H. B., Brown, R. L., McSweeny, J. L., & Nadler, C. J. (2003). A diagnostic marker for childhood apraxia of speech: the lexical stress ratio. Clinical Linguistics & Phonetics, 17(7), 549-574.

Munson, B., Bjorum, E. M., & Windsor, J. (2003). Acoustic and perceptual correlates of stress in nonwords produced by children with suspected developmental apraxia of speech and children with phonological disorder. Journal of Speech, Language, and Hearing Research, 46, 189-202.

Velleman, S. L., & Shriberg, L. D. (1999). Metrical analysis of the speech of children with suspected developmental apraxia of speech. Journal of Speech, Language, and Hearing Research, 42, 1444-1460.

Kircher, T. T. J., Brammer, M. J., Levelt, W., Bartels, M., & McGuire, P. K. (2004). Pausing for thought: engagement of left temporal cortex during pauses in speech. NeuroImage, 21(1), 84-90. doi:


Goal Selection for Progressive Phonological Change

On April 28, 2022 I presented a workshop, ‘Assessment of Children with Speech Sound Disorders (SSDs): Identification of Subtypes’, at the SAC Speech-Language Pathology Conference. The conference was meant to be in-person but was switched to virtual format at the last minute and therefore some parts of my presentation did not go as planned, in particular, an exercise on target selection. I promised to put the speech sample and the “answer” to the activity in my blog. Notwithstanding the fact that there is no one right answer to the activity I am doing that today. I am sorry to be over two months late getting around to this but I find it hard to keep up with the blog now that I am school director. Before I proceed I will point out that I did publish another blog based on that workshop to highlight sources (specifically test tools). Also, I have written several blogs previously on target selection, at least one describing the particular rubric that I am recommending here. The proposed rubric for selecting goals is drawn from a paper published by Pamela Grunwell in 1992 in which she discusses Processes of Phonological Change in Developmental Speech Disorders. My adaptation of this process to modern phonological analysis and treatment approaches is described in detail in both books authored with Francoise Brosseau-Lapre, specifically Developmental Phonological Disorders: Foundations of Clinical Practice as well as the undergraduate textbook Introduction to Speech Sound Disorders.

Selecting Treatment Goals

The process for selecting treatment goals to effect progressive phonological change has the following steps:

  1. Obtain a sample of the child’s speech and conduct a phonological analysis, preferably using a non-linear analysis, to identify strengths and needs at all levels of the phonological hierarchy, typically focusing on word shapes, syllable structure, and features.
  2. Classify patterns of mismatch (error) in the child’s speech as: a. variable (mismatches are inconsistent); b. context specific (structure is correct in one context but consistently incorrect in another); or c. consistent (the structure is never produced correctly).
  3. Select one treatment goal from each of these categories so that you will be targeting: a. a STABILIZE goal to improve intelligibility quickly by reducing variability in the production of one structure; b. an EXTEND goal by extending production of a structure from one context where it is correct to another where it is not; and c. an EXPAND goal by introducing a new structure that is currently absent from the child’s system.
  4. The three structures can be targeted by using a vertical, cycles, or horizontal goal attack strategy. My preference is for horizontal with a cycles component. That is, target all three at the same time for a block of predictable length (e.g., six or twelve weeks); then switch to three new targets, again chosen across the three types of goals.
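The classification in step 2 can be sketched in Python. This is only an illustration: the function name, the data layout (percent correct for one structure in each context), and the all-or-none cut-offs of 0% and 100% are assumptions made for the sketch, and a real analysis needs clinical judgment about how much data supports each figure.

```python
def classify_pattern(percent_correct_by_context):
    """Classify a mismatch pattern from percent-correct scores per context.

    percent_correct_by_context maps a context label (e.g. "onset", "coda")
    to the percentage of attempts in which the structure was correct.
    """
    scores = list(percent_correct_by_context.values())
    if not scores:
        raise ValueError("at least one context is required")
    if all(score == 0 for score in scores):
        return "consistent"        # never correct: candidate EXPAND goal
    if all(score in (0, 100) for score in scores) and 0 in scores:
        return "context specific"  # correct in one context only: candidate EXTEND goal
    if all(score == 100 for score in scores):
        return "no mismatch"
    return "variable"              # inconsistent mismatches: candidate STABILIZE goal

# Hypothetical scores, for illustration only.
print(classify_pattern({"onset": 0, "coda": 100}))  # context specific
print(classify_pattern({"onset": 25, "coda": 40}))  # variable
print(classify_pattern({"onset": 0, "coda": 0}))    # consistent
```

The strict cut-offs keep the sketch short; in practice a structure that is correct 95% of the time in one context would probably be treated as established there.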

Practice Activity

Step 1. A speech sample was constructed for a hypothetical three-year-old child with moderate expressive language delay and mild receptive language delay. Notice that CV, CVC, CVCV, and CVCVC word shapes are present along with glides, stops, nasals, and fricatives.

swimming [wɪmi] | tight [dɑɪt] | kite [dɑɪt]

Step 2. Identify patterns belonging to the three major types. Here phonological patterns are described using familiar terminology for the most part:

Variable | Context specific | Consistent
Weak syllable deletion 50% | Velars fronted in onsets | Clusters reduced
Nasals deleted from codas 30% | Stopping of fricatives in onsets | Liquid gliding
Unstable -voice in onsets 75% | Gliding of nasals in onsets | Palatal fronting

Step 3. Select one goal from each column, that is, variable pattern, context specific pattern, and consistent pattern. If I remember my statistics classes from over 30 years ago correctly, choosing one pattern from each of the three columns yields 3x3x3=27 possible answers and they would all be fine but maybe we can be a bit more systematic than that. Here is my suggestion (but you can disagree with me in the comments!)

Stabilize production of voiceless stops in the onset position.
Extend production of velar stops from coda to onset.
Expand the system to include /s/ clusters, starting with the coda position.

These recommended goals are not necessarily the “best” ones but they are designed to be complementary. The voiceless stops in the onset and the /s/-clusters in the coda might support the appearance of fricatives and /s/-clusters in the onset. This is especially true if the voiceless stops are produced with proper aspiration. The work on velar consonants in the onset will permit focus on the voice/voiceless contrast as well.
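For readers who want to see every option in Step 3, the following Python sketch lists each way of picking one pattern per column from the Step 2 table. Picking one pattern from each of three columns of three gives 3 x 3 x 3 = 27 sets. The variable names are illustrative; the pattern labels are copied from the table with the percentages dropped.

```python
from itertools import product

variable = ["Weak syllable deletion", "Nasals deleted from codas",
            "Unstable -voice in onsets"]
context_specific = ["Velars fronted in onsets", "Stopping of fricatives in onsets",
                    "Gliding of nasals in onsets"]
consistent = ["Clusters reduced", "Liquid gliding", "Palatal fronting"]

# One pattern from each column: a STABILIZE, an EXTEND and an EXPAND candidate.
goal_sets = list(product(variable, context_specific, consistent))
print(len(goal_sets))  # 27

# The set suggested above is one of the 27.
suggested = ("Unstable -voice in onsets", "Velars fronted in onsets", "Clusters reduced")
print(suggested in goal_sets)  # True
```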

A completely different set could have been chosen to centre on the gliding of nasals in the onset given the cost to intelligibility. This choice would impact the choices in the stabilize and expand categories. You can think about this. You will need another set for your second “block” of treatment sessions in any case.

Notice that the rubric ensures that you will have a range of treatment goals that covers different levels of the phonological hierarchy and a spectrum of complexity. Importantly you will have several goals that reinforce change across the child’s phonology. The rubric works for young children or children with severe impairments and constrained phonologies. It also works for older children with less severe phonological impairments. It works for a variety of treatment approaches. It is very versatile. You can make choices and feel confident that you have a good justification for your treatment plan.

If you try it you will have questions however. Just ask me. I am happy to answer.

SAC 2022: Follow-up on Assessment Tools

I presented a workshop at the SAC 2022 conference entitled “Assessment of Children with Speech Sound Disorders: Identification of Subtypes.” A recording of the session will be made available to conference participants and a slide handout was provided. Participants asked for more information about how to access the assessment tools that I discussed. With the exception of the Syllable Repetition Task, I presented the assessment tools as examples of the types of assessments you might want to administer, rather than recommendations of the specific tools you need to use. You can look for other tools or use the tests you are currently comfortable with. However, I will provide links to more information here about the specific tools that I mentioned.

First, the approach to assessment and interpretation of the assessment results was described in detail in the following paper (contained in an open-access special issue on apraxia of speech):

Rvachew, S., & Matthews, T. (2017). Using the Syllable Repetition Task to reveal underlying speech processes in Childhood Apraxia of Speech: A tutorial. CJSLPA, 41(1), 106-126

During the presentation I described three subtypes of speech sound disorder that are differentiated by the primary underlying processing impairment, either phonological processing, phonological planning, or motor planning. The profile of assessment results associated with each subtype was described in relation to speech behaviors, oral-motor exam performance, Syllable Repetition Task results, and phonological processing as assessed by measures of speech perception and phonological awareness. I will not review the subtype-specific profiles here but I will put links to help readers find the assessment tools.

Speech Behaviors

Both standardized and nonstandardized elicitation of speech in single word and connected speech contexts is required to describe the child’s speech. One tool that we have found useful is the:

Dodd, B., Zhu, H., Crosbie, S., Holm, A., & Ozanne, A. (2006). Diagnostic Evaluation of Articulation and Phonology (DEAP): Pearson Education.

This test is useful because there is an Articulation Test (including accuracy of consonants and vowels), a Phonology Test (including single word and connected speech elicitation), and a Word Inconsistency Test. Detailed instructions permit the user to identify which subtype of speech sound disorder the child might have according to Barbara Dodd’s scheme, that is Articulation Disorder, Delayed Phonological Development, Consistent Phonological Disorder, or Inconsistent Phonological Disorder.

Oral-Motor Performance

A measure of simple and complex non-speech and speech movements is essential. There are many to choose from with the choice determined by the age and capability of the child:

DEAP Oral-Motor Screen (part of the Dodd et al., DEAP Articulation Test)

Fletcher, S. G. (1972). Time-by-count measurement of diadochokinetic syllable rate. Journal of Speech and Hearing Research, 15, 763-770.

Robbins, J., & Klee, T. (1987). Clinical assessment of oropharyngeal motor development in young children. Journal of Speech and Hearing Disorders, 52, 271-277.

Thoonen, G., Maassen, B., Wit, J., Gabreels, F., & Schreuder, R. (1996). The integrated use of maximum performance tasks in differential diagnostic evaluations among children with motor speech disorders. Clinical Linguistics & Phonetics, 10, 311-336.

Thoonen, G., Maassen, B., Gabreels, F., & Schreuder, R. (1999). Validity of maximum performance tasks to diagnose motor speech disorders in children. Clinical Linguistics & Phonetics, 13, 1-23.

Williams, P., & Stackhouse, J. (2000). Rate, accuracy and consistency: diadochokinetic performance of young, normally developing children. Clinical Linguistics & Phonetics, 14, 267-293.

St. Louis, K. O., & Ruscello, D. M. (2000). Oral Speech Mechanism Screening Evaluation-Third Edition (3rd ed.). Austin, TX: PRO-ED.

 Rvachew, S., Hodge, M., & Ohberg, A. (2005). Obtaining and interpreting maximum performance tasks from children: A tutorial. Journal of Speech-Language Pathology and Audiology, 29(4), 146-156.

Rvachew, S., Ohberg, A., & Savage, R. (2006). Young children’s responses to maximum performance tasks: Preliminary data and recommendations. Journal of Speech-Language Pathology and Audiology, 30(1), 6-13.

Syllable Repetition Task

A tutorial to describe how to score this test is provided in Rvachew & Matthews (2017) linked above. Tanya Matthews also made a video tutorial that is currently at the Online Courses tab of the International Association of Communication Sciences and Disorders but it is “members only” access unfortunately. I hope to get this on my own website in the future for broader distribution. The materials to administer and score the SRT are available at the Child Phonology Project Website as follows:

PowerPoint of SRT. Note: After opening the Powerpoint document, push F5 to play slideshow. This powerpoint includes audio files; turn your speakers on to hear the presentation of the task.

Shriberg, L. D., Lohmeier, H. L., Campbell, T. F., Dollaghan, C. A., Green, J. R., & Moore, C. A. (2009). A nonword repetition task for speakers with misarticulations: The Syllable Repetition Task (SRT). Journal of Speech, Language, and Hearing Research, 52, 1189-1212.

Lohmeier, H. L. & Shriberg, L. D. (2011). Reference Data for the Syllable Repetition Task (SRT). (Tech. Rep. No. 17). Phonology Project, Waisman Center, University of Wisconsin-Madison.

Phonological Processing

There are many tests of phonological awareness that can be used and therefore I will not list them all here. However there is a problem with the lack of access to measures of speech perception. One measure that is available for children aged 4, 5, and 6 is SAILS with stimuli available for children who speak North American or Australian English. SAILS is available on the App store for download to iPads. Search for SAILS Rvachew or SAILS on the App Store and the page for the app should come up.

You can create your own test using a procedure described in this paper:

Locke, J. L. (1980). The inference of speech perception in the phonologically disordered child. Some clinically novel procedures, their use, some findings. Journal of Speech and Hearing Disorders, 45, 445-468.

To finish up I mention that this assessment approach and these procedures are described in the book shown below along with detailed descriptions of the treatment approaches that I mentioned:

Rvachew, S., & Brosseau-Lapre, F. (2018). Developmental Phonological Disorders: Foundations of Clinical Practice (Second ed.). San Diego, CA: Plural Publishing, Inc.

The advantage of having the book is the amount of detail that is provided about administration and interpretation. The norms for several of these procedures are reproduced in the book as well. Regarding intervention procedures, this is one of the few clinical texts that describes exactly how and when and why to implement specific intervention procedures. One important message that I tried to convey during my workshop was that children who correspond to these different subtypes require a different approach to treatment. I wasn’t able to teach the type of intervention that they need but the intervention approaches that each child will respond to are quite different. Therefore, it is important to understand the procedural details.

Phonological Analysis in Private SLP Practice

In the previous blog post in this series I talked about the cost of obtaining a full kit of assessment materials for targeting your private practice at young children with speech sound disorders. Another important cost is the indirect time associated with each assessment. This is time well spent however because a proper analysis of an in-depth assessment will ensure that the treatment plan is effective. A good outcome requires not only time to analyze assessment results however: it requires expertise which brings us back to the topic of my first post in this series. I am aware that many practitioners and academics in our field are rather dismissive of “articulation”, thinking that it is easy to provide services in this area but in fact a considerable depth of expertise is required to fully understand children’s patterns of speech error, and more importantly, the underlying causes of those error patterns. I frequently tell my students about notable mistakes that occurred when I failed to make an in-depth analysis of a child’s speech. These errors resulted from over-confidence: often a speech sample appears to provide easy answers, the patterns appear to be obvious when they are not. Similarly, when SLPs are struggling to achieve progress with a child they will put out calls for help that assume the original problem description is accurate, e.g., “I am treating a four-year-old who substitutes [k] for /t/ and I am not having much success. Can anyone help me?”

Their colleagues rush in to help, despite the limited information, and assume that their own favourite procedures will be more effective. What if the description of the error pattern is faulty or a specific type of treatment approach is required to address the child’s underlying issues? Let’s examine the number of different possibilities, given this brief description.

  1. The child may actually substitute [t] for /k/ (SLPs describe these patterns backwards all the time as I have discussed before). Does the child also substitute [d] for /ɡ/ and [n] for /ŋ/? In this case, we have an ordinary velar fronting pattern. If there are other phonological processes and the child has good underlying phonological knowledge of the contrasts, the cycles approach is an excellent choice, keeping in mind the required intensity of treatment.
  2. Does the child have poor perceptual knowledge of the error contrasts? Regardless of whether the error is fronting or backing, it is essential to find out and this can be done using the SAILS tool. Many children with a phonological disorder have problems with phonemic perception and will benefit from an input-oriented approach as described in our book (see also Rvachew & Brosseau-Lapre, 2015).
  3. Is the backing of velars context-specific? Camarata and Gandour (1984) describe an interesting case in which the child produced an alveolar stop before high vowels (e.g., tea, key→[ti]) but a velar stop before non-high vowels (e.g., cup, duck→[kʌp], [kʌk]) and therefore the pattern provided evidence of an allophonic rule. A structured meta-phonological approach might be required to correct this kind of pattern.
  4. Another context-specific pattern involves assimilation. Perhaps the child produces the error only in those contexts that promote spreading of the Dorsal feature within a syllable (e.g., “chalk” → [kak]) or within a complex onset (e.g., “train” → [kweɪ]). [Good knowledge of multilinear phonology is required to understand and identify these patterns]. These errors are very specific to their contexts and cannot be corrected using procedures that are directed at a phonological process called “backing”. These assimilation errors usually self-correct as the child’s phonemic repertoire expands and stabilizes. The method of meaningful minimal pairs can be used to support the development of new phonemic contrasts. These methods are described in detail in our book.
  5. Is there evidence of an undifferentiated lingual gesture as described by Gibbon (1999)? In this case the alveolar and velar consonants might be produced with abnormal tongue movements such that the whole body of the tongue and the tongue blade rise to the palate and alveolar ridge to achieve closure of the vocal tract, followed by variable opening gestures. There may be other indications of articulatory difficulties such as distorted fricative sounds, reduced tongue strength, and/or slow single-syllable repetition rates. The results of the oral-motor examination and observation of the child’s tongue movements during speech will play a large role in differentiating an articulatory cause from a phonological or perceptual basis for the child’s speech error pattern. Clearly those children who have difficulty producing typical tongue movements during speech will need a therapeutic approach with an articulatory focus involving phonetic placement or visualization of the tongue.
  6. Are the apparent backing errors due to metathesis and other inconsistent word productions? For example, the child might be trying to say “helicopter” and produce [hɛkɪkɑpɚ], [hɛlɪpɑkɚ], and [hɛtɪlɑtɚ]. In this case there is no consistent pattern of substitutions related to the /t/ and the /k/ but a significant problem with the representation of the phoneme sequence for this word, or, with the planning of the phoneme sequence for this word prior to setting up the motor plan. This is referred to as inconsistent phonological disorder. It can be identified by administering the Word Inconsistency Assessment and by testing for deficits in phonological memory. The appropriate treatment is the Core Vocabulary Approach.
  7. Finally, backing errors are often considered to be a sign of Childhood Apraxia of Speech. However, this diagnosis would be appropriate only if there were other signs in the child’s speech such as segregation errors, distortion errors, and dysprosody. Furthermore, you would expect a slow three-syllable repetition rate in the presence of intact single-syllable repetition skills. The possible treatment approaches are too numerous to list here and would depend upon many aspects of the child’s profile. Often, however, high-intensity practice of nonsense syllables forms part of the intervention, as described in the context of sensory-motor approaches for younger and older children in Chapter 10 of our book. See also the RCTs by Murray, McCabe and Ballard (2015) and by Rvachew and Matthews (2019).
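The inconsistency described in item 6 can be quantified by eliciting each word several times and counting the words whose productions differ across trials. A minimal sketch in Python follows; the function names are hypothetical, the “helicopter” forms are the ones given in item 6, and the second word is invented for illustration.

```python
def is_inconsistent(productions):
    """True when repeated productions of one word are not all identical."""
    return len(set(productions)) > 1

def percent_inconsistent(sample):
    """Percentage of words produced in more than one way across trials."""
    flags = [is_inconsistent(trials) for trials in sample.values()]
    return 100 * sum(flags) / len(flags)

sample = {
    "helicopter": ["hɛkɪkɑpɚ", "hɛlɪpɑkɚ", "hɛtɪlɑtɚ"],
    "kite": ["dɑɪt", "dɑɪt", "dɑɪt"],  # same error every time, so not flagged
}
print(percent_inconsistent(sample))  # 50.0
```

Note that a word produced incorrectly but identically on every trial counts as consistent here, which is the distinction between a consistent substitution pattern and inconsistent word production.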

I have not explained how to conduct the detailed analysis that would reveal the different patterns that I have described here (again, our book lays out the procedures for multilinear analysis as well as the components of a deep assessment). The point is that designing a treatment program for a child with a speech disorder is not “easy”. It takes time and skill in every case. Planning for (and charging for) this time-consuming process is an essential part of providing excellent service in a private practice. We can all feel proud of our ability to make these complex decisions. We must fight for the time, in public and private practice, to do this properly.

Phonology Assessment in SLP Private Practice

I am continuing with a series of blogs about private practice speech-language pathology. Fortunately many new clinicians can join an established practice and won’t have to think about these things. More frequently however young SLPs are setting up their own practice right out of school or very early on, in their home during their maternity leave for example. The focus of this blog is an aspect of setting up your practice that I am not seeing much advice about in other blogs (maybe I missed blogs about this, please write with links if I have). There are other excellent blogs on setting up a private practice, for example 5 Key Steps to Start a Speech Therapy Private Practice. ASHA also has a site that links to many key resources on the topic, including ethics issues and quality indicators for your practice: Private Practice in Speech-Language Pathology. I recommend these sources, but I am going to fill in a small but critical gap and that is the necessity of having the appropriate assessment materials on hand before you begin. And because you must obtain proper copyright to the materials or buy them (do not borrow them from your other employer!), I will provide some costing information. In the last blog I talked about the importance of expertise and specializing in those clients you are most qualified to serve. Therefore, I will focus on speech sound disorders in children aged approximately 3 through 8 years.

The need to have assessment materials should be obvious. The practice guidelines world-wide indicate that your treatment goals and plans must be based on the results of a comprehensive assessment. Often times I see SLPs trying to select goals and treatment methods without having the results of a comprehensive assessment to guide their choices. How does this happen? There are so many reasons, almost too many to count but a few of them are unique to the private practice environment. Clients may not want to pay for the time it takes to assess when they are so desperate for treatment and their insurance may cover only 6 sessions. And the SLP may not want to pay for assessment materials when they are so expensive and observation can be valuable. Free observation is not a substitute for systematic assessment and analysis of the data in any circumstance. In the next blog I will demonstrate all the ways that superficial observations can be misinterpreted or at least differently interpreted. Even if the data is a detailed speech sample, a transcription and phonological analysis will be required. So which assessment tools are minimally required? Francoise and I developed a rubric for this, shown as Figure 5-1 in our DPD text and Figure 3-2 in our IntroSSD text. I will show the types of assessments in the table below, including those that are mandatory and those that are optional* and suggest options for each with free and commercial sources indicated.

Construct | Possible Test | Source
Contextual factors | Case history | DPD text (Publications :: Plural Publishing)
Articulation accuracy | DEAP Articulation Test | Diagnostic Evaluation of Articulation and Phonology (DEAP)
Stimulability | DEAP or informal | Informal is fine
Oral-motor screen | DEAP or other published | e.g., DPD has 3
Speech accuracy in continuous speech | Reference data for Percent Consonants Correct (Shriberg et al.) | Reproduced in DPD or see Austin & Shriberg (1997)
Hearing acuity | Hearing screening | Free apps: HearScreen - THE AUDIOLOGY PROJECT
Phonology* | DEAP or hand scored from conversational sample | See above or DPD
Word Inconsistency* | DEAP | See above
Intelligibility | Intelligibility in Context Scale | Overview – Multilingual Children’s Speech
Speech Perception* | Speech-Production Perception Task or SAILS | Free procedure in DPD or see
Phonological Awareness* | Phonological Awareness Test (implicit) | Free with norms in DPD
Nonword Repetition* | Syllable Repetition Task | Overview – The Phonology Project – UW–Madison
Language screen | e.g., QUILS (3 to 5 yrs) or SPELT (4 to 9 yrs), or story retell with SALT | Language Screening Tools – QUILS; SALT Home Page
Intelligence screen* | Kaufman Brief Intelligence Test | Clinical Assessment Canada – English

This list of required test materials looks lengthy but the ultimate cost is quite moderate. Instructions and normative data for the case history, the oral-peripheral examination, articulation and phonological analysis for toddlers through school age children (with normative expectations), speech perception testing and an implicit awareness test are all tucked inside the DPD text which can be obtained for $150.00. Measures of intelligibility, a hearing screener, and the syllable repetition task are available free on the internet. You should have a standardized measure of articulation and/or phonology. I like the DEAP because it is comprehensive with good diagnostic properties; it costs about $600 with test forms. You can measure expressive language abilities informally although it is time consuming to do so. For younger children the QUILS receptive language screener is only $100. Generally standardized tests are in the $300 to $600 range unfortunately.

You might question the value of the optional tests, especially the K-BIT. However, I strongly recommend having the Kaufman Brief Intelligence Test because it includes a receptive vocabulary test as the verbal IQ screen and a nonverbal IQ screen and often you need a little bit of extra information to justify referring children to a psychologist. I have struggled to achieve progress with quite a few preschoolers in my practice who turned out to have very significant but undiscovered cognitive delays. The K-BIT is sold by Pearson for about $500. For children younger than four a play skills assessment can be a good substitute.

So, not worrying about exchange rates and rounding around the edges, you can count on spending $1500 on assessment materials in your first year. You should count on spending that much every year in order to update your editions and add tests in areas not covered by this stripped down list. After you add your provincial and federal association fees and your malpractice and liability insurance you are still not paying very much to start charging people for your services. The real costs come with actually conducting and then scoring the tests, and in phonology, analyzing the data. However, that is the competence that your clients are paying you for. More about that in the next post.
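As a rough check on that figure, the prices quoted in this post can be tallied. This is a minimal sketch in Python: the item labels are illustrative names chosen here, the prices are the approximate figures mentioned above rather than current quotes, and the $1500 budget is the rounded estimate from the text.

```python
# Approximate prices quoted in this post (dollars, ignoring exchange rates).
# The item labels are illustrative, not official product titles.
quoted_prices = {
    "DPD text": 150,
    "DEAP with test forms": 600,
    "QUILS screener": 100,
    "K-BIT": 500,
}


def first_year_total(prices):
    """Sum the quoted prices for the starter assessment kit."""
    return sum(prices.values())


total = first_year_total(quoted_prices)
budget = 1500  # rounded first-year estimate from the text
print(f"Quoted items: ${total}")
print(f"Room left in a ${budget} budget: ${budget - total}")
```

The quoted items come in a little under the rounded estimate, which leaves some room for test forms, shipping, and exchange rates.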

Expertise in SLP Private Practice

I have not been writing blog posts for some time. I miss it. I enjoyed spending cold weekends at my cabin, by the fire, researching a new topic, hunting down references, and putting my thoughts to paper. Even with typos and half-formed thoughts, those blogs felt like an accomplishment and so many people read them! To this day. But I gave it up because my position as school director took so much time (you will not believe what I do on Sundays now) and also I froze my right shoulder and it became painfully impossible to type for long periods, seven days a week. After three months of physiotherapy, started way too late, I can finally move my arm, and I have decided to write some blogs again. The physio is relevant because I have been thinking about private practice: the excellent service the physio provides, my own practice given up 20 years ago, and all the young practitioners that I am talking to because they contact me for help. In the past only very experienced SLPs started their own practices but now it is common for people right out of school to end up in private practice of one form or another. Since I don’t have time to research weighty topics, I will set down what will be essentially rants about the right and wrong way to do this because I am seeing some stuff that curls my hair frankly. When the mood hits me, I will write about a different aspect of setting up a practice, starting with the most important resource that you need to put into your practice: expertise. What is expertise? Why is it so important in private practice? How do you gain expertise in private practice?

What is expertise?

In Canada clinical training now reflects a competency-based approach. The Canadian framework adopts a “daisy model” in which numerous professional roles are arranged around the core “expert” role. The daisy organizes the different competencies that are required for entry-to-practice with the expectation that competencies will improve and deepen with further practice. The expert role is defined over many pages of specific competences but boils down to this: apply knowledge of development and disorders of communication together with assessment and intervention skills to clinical practice. Another definition of expertise that I really like is: “apply skills and knowledge of the discipline to make decisions with limited information in relevant contexts.”

Why is expertise so important in private practice?

Expertise is always essential but especially so in private practice: in sole-practice settings, the SLP is completely responsible for her own decisions. In larger public organizations there is an infrastructure to support decision making. In schools for example, who gets service and what kind and how much might be completely determined by regulation. The type of service provided might be a matter of tradition or culture, rightly or wrongly. Certainly, you will have colleagues that you can consult when you need help. Your employer will send you to conferences and provide professional development. You don’t have that support structure in private practice. Your client will expect you to justify your decisions directly and you will need to do that on the basis of expertise and that means knowledge. On average, your clients will be richer, more educated, and will consider themselves to be more discerning about their needs. For sure, they will have a choice about whether to accept or abandon your service. I have walked out on physiotherapists in the past because they told me to do what they always do; lacking a rationale, I had no faith in their practice. I am willing to pay a lot of money to the current practitioner because I can tell that he has a coherent theory of practice and that he is using deep knowledge to guide his treatment plan and solve problems. I am motivated to engage in an hour of painful exercises a day by visible evidence of progress and confidence in the therapist’s competence. Even children will find tangible evidence of learning and positive assurance from their SLP more motivating than stickers. You have a professional duty to demonstrate competence. This means a coherent theory of therapeutic change, deep knowledge of development and disorder(s), and a clear link between that knowledge and your practice.

How do you gain expertise in private practice?

Naïve models of expertise assume that it accumulates over time as you acquire “more” of something. Recall that the definition is essentially apply knowledge and skills to clinical practice. It has been assumed that teachers with more years of education will be more competent than those with less as a consequence of greater knowledge but this has not generally been supported by research. It has been assumed that psychologists with more years of practice will achieve better results than those with fewer as a consequence of better skills but research dispels that notion as well. Private SLPs like to acquire more skills through the accumulation of certificates attesting to their ability to apply new techniques or programs. However, the acquisition of more knowledge or skills or even practice does not add up to expertise unless the result is greater ability to solve problems. For this reason, practice with feedback is key. Feedback, that is your client’s response to therapy, provides invaluable information about the efficacy of your practice (feedback from mentors helps but feedback from your clients is more important and pertinent). SLPs do not always use client information as a clue to the efficacy of their practice. Often the client’s progress or lack of it is attributed to factors outside of the SLP’s control which is unfortunate. There is no opportunity for learning and no chance of deepening expertise if the client’s progress is not taken as important evidence of your own competence.

What if you observe that your client is not progressing? I often present data in my conference talks from a child who made literally no progress over a three year period. He was receiving lots of intervention from public therapists and the same private therapist over that time. But there seemed to be no recognition of his flat trajectory. It was astounding to see it in our research data. How is it possible? In fact, I have seen this happen quite a bit in my research. It seems that we have a tendency to practice on the basis of past experience while by-passing both knowledge and feedback, like this:

Client A: Symptom A: Apply TxA: Achieve success

Client B: Symptom A: Apply TxA: Achieve no success: Repeat

In this case, the treatment plan for Client B is built on the basis of experience (practice) with a previous client without integrating past knowledge or current feedback. The necessary knowledge relates symptoms to diagnoses, like this:

Client A: Symptom A+B: Diagnosis A: Apply TxA: Achieve success

Client B: Symptom A+C: Diagnosis B: Apply TxB: Achieve success

I am pretty sure that SLPs are failing to apply knowledge to problem solving quite often because when they ask for help, they do not have all the information they need to form a diagnosis and when I refer to my book, the one I taught them with, they don’t have it! How can you apply knowledge to solve a problem if you don’t have your knowledge with you? And how can you solve a problem if you have not analyzed all the parameters of the problem? This brings me to the last point I want to make about expertise. Knowledge and applying it appropriately is an essential part of the picture. It is hard to know about everything (I have forgotten more about syntax than I ever knew, it’s shameful really). So, private SLPs should specialize. Start with one thing you know a lot about and then make sure you know more about that every day. You can add another specialty area after a while if you get really knowledgeable and well practiced with the first one. Do not pretend to be an expert in everything. If you advertise your services in everything your colleagues in the milieu will know that you are not a serious member of the private practice community and you will not be serving your clientele properly. If a person calls and asks you to provide a service you are not competent to provide, refer them to a person who is competent to provide it. If no-one is and you are their best shot, tell them you are not too sure about what to do and you will need to consult experts along the way. Never pretend to be an expert when you are not. It is unprofessional and embarrassing. Fortunately, it is not that hard to become an expert in most parts of SLP once having obtained your degree. Basically, read stuff, lots of stuff including basic science and practice guidelines. Be careful what you read because some stuff is not credible. Read real stuff by real experts. Apply what you read and pay attention to the results.

OME and Speech Therapy

A new paper has been published (Brennan-Jones et al, JSLHR, 2020) that examines the relationship between the outcome of a single tympanometry assessment at age 6 with PPVT test scores at age 6 and 10 and CELF scores at age 10. My doctoral thesis was on the topic of otitis media with effusion (OME) and I noticed that the authors curiously omitted the most important large sample prospective study from consideration in their introduction and their discussion. The omission seems to have been strategic. The authors were motivated to compare their own study to others that had been flawed by ascertainment bias. However, there are other excellent studies that used a prospective design with good quality sampling procedures. Furthermore, these other studies have the advantage of multiple assessments of middle ear function at an age that is of particular relevance to language development. It is instructive to consider the findings of the literature as a whole when attempting to draw conclusions about the clinical implications of the Brennan-Jones et al findings. The study cannot stand alone. For this reason, I offer my own commentary.

  1. OME is Normal

The first and most important fact to understand about otitis media with effusion is that it is normal. Because it is a “silent” condition the fluid in the child’s middle ear can remain unnoticed for 30 or more days. Even worse, common treatments such as antibiotics or decongestants are quite useless when it comes to clearing up the fluid. Although infection is dangerous to the child’s health it is the fluid that impairs hearing and it is the fluid that is hardest to cure. So children can spend a lot of their life with suboptimal hearing. That study that Brennan-Jones et al ignored? It involved frequent prospective monitoring of middle-ear status in 2253 infants, from 61 days until 2 years of age (the Pittsburgh study by Paradise et al). The proportion of infants who were observed to have middle ear effusion more than once was 48%, 79% and 91% at ages 6, 12, and 24 months respectively. On average these infants spent about 20% of their life with fluid in one or both ears. A similar study conducted in Boston (Teele, Klein & Rosner) followed children from birth to age 3 and recorded a range of 0 to 500 days with middle ear effusion and an average of 116 days. Half the sample experienced more than 90 days with middle ear effusion and almost half the sample had a bout of OME during their first year. To summarize, nearly every child gets at least one ear infection but the number of days with middle ear effusion varies greatly from child to child.

  2. OME Causes Significant Hearing Loss

Almost every paper that discusses the conductive hearing loss that is associated with OME describes it as “mild” because most children achieve pure tone average thresholds of 20 to 25 decibels during an episode of OME and only 10% suffer losses of greater than 40dB (see Roberts et al for review). However the amount of hearing loss changes greatly during each episode and greatly across the population of children who have OME. Furthermore, children require a much greater signal-to-noise ratio to achieve the same perceptual performance as an adult when identifying and discriminating speech signals. The same level of hearing loss that is mild for an adult with normal language abilities is significant for an infant or young child that is engaged with the task of learning his or her first language.

  3. OME is Associated with Variations in Language Development

Both the Pittsburgh study (2253 infants monitored prospectively from early infancy) and the Boston study (205 infants followed prospectively from birth) found that amount of time with middle ear effusion was correlated with language development. I am reproducing some of the data below, grouped according to days with MEE ascending down the rows and SES categories ascending across the columns. In both studies, MEE and SES are significant predictors of vocabulary knowledge. In the Pittsburgh study the vocabulary measure was parent report of productive vocabulary on the MacArthur Communicative Development Inventory when the child was 24 months old. In the Boston sample, the measure of receptive vocabulary was the Peabody Picture Vocabulary Test.

[Table: Pittsburgh Sample (Expressive Vocabulary). Columns: Low SES, Mid SES, High SES. Rows: Least MEE to Most MEE.]

[Table: Boston Sample (Receptive Vocabulary). Columns: Low SES, Mid SES, High SES. Rows: Least MEE to Most MEE.]

How do we interpret these data? The first thing to notice is that variation in vocabulary size is normal (Fenson et al., 2000). At 24 months a child might produce no words or over 400. What accounts for this broad variation? It is common to call on genetic explanations but environmental inputs play a large role in vocabulary development specifically and SES and OME are both environmental variables. The point here is that OME does not cause language delay but it is one variable that helps to explain the large variation in early vocabulary development within the normal range.

  4. What are the clinical implications of the research on OME?

It is a rather common tactic to conclude that the research data indicating a correlation between OME and slightly slower growth in some aspect of language development (as reported in Brennan-Jones et al between age 6 and 10 years for example) is of no particular clinical interest. The reason for this conclusion is that the impact of OME is taken to be “small” because the mean test scores are all within the normal range. In other words, OME does not cause language impairment and therefore “no clinical implications.”

Let’s think about this from the perspective of an SLP treating one particular patient. I have in mind the most common type of patient treated by the pediatric SLP in the world (I can predict this from survey data and large scale caseload studies): a child aged somewhere between 4 and 7 with a mixed speech sound disorder and expressive language delay. We can expect an underlying impairment with phonological processing that has a heritable genetic cause (Bishop et al, 2008). The most important protective factor (Rvachew & Grawburg, 2006) will be the child’s vocabulary size, something that is highly malleable. If the child receives sufficient high-quality inputs, it will be a lot easier to bring phonological processing skills into the expected range and ensure acquisition of literacy skills. If the child has chronic OME, you don’t really care whether the OME has caused the child’s speech and language difficulties or not. Even though I would still argue that there is reason to be concerned about permanent effects of OME during the first year on the development of the auditory system, you can let the scientists worry about that. The issue is that this child cannot afford to lose a single word of language input. Because right now, intense high-quality language input is all we have in our treatment tool box. Let’s make sure that each child on our caseload can hear the precious minutes of therapy input that we are providing. And when we send them back to their noisy homes and classrooms with their homework books, let’s make sure they can participate in those activities to their maximum benefit. Hearing impairment affects everybody. And this child in particular doesn’t have any days to lose.



Speech Therapy and Speech Motor Control: Part 3

In two previous blogs I discussed a recent paper by Strand in which she outlines in detail the theoretical foundation and procedural details of Dynamic Temporal and Tactile Cueing (DTTC) as a treatment for Childhood Apraxia of Speech (CAS). In Part 1 I suggested that the theoretical base, being Schmidt’s “Schema Theory of Discrete Motor Skill Learning,” was outdated. In Part 2 I discussed modern theories of speech motor control that assume a dynamic interplay of feedforward and feedback control mechanisms. In this blog I will discuss the implications for speech therapy, in relation to critical aspects of DTTC.

First, let us consider the core element of DTTC, “the focus on the movement (rather than the sound or phoneme) in terms of modeling, cueing, feedback, and target selection” (p. 4). I believe that all of us who strive to help children with CAS acquire intelligible speech agree that speech movements are the focus of speech therapy, as opposed to phonological contrasts. Nonetheless, this statement raises questions about the nature of “speech movements.” What is the goal of a speech movement? The answer to this question is controversial: it may be a somatosensory target involving specific articulators, for example, bringing the margins of the tongue blade into contact with the upper first molars; or it may be to produce a particular vocal tract shape such as a large back cavity separated from a small front cavity by a narrow constriction; or it may be to produce an acoustic output that will be perceived as the vowel [i]. The DTTC is structured to promote precise and consistent movements of the articulators and therefore the first scenario is presumed. Furthermore, the origin of CAS is hypothesized to be a deficit in proprioceptive processing that arises from an impairment in cerebellar mechanisms. Updating the theory, this hypothesis would implicate feedforward control which, following from Guenther and Vladusich (2012), “projects directly from the speech sound map [in left ventral premotor cortex and posterior Broca’s area] to articulatory control units in cerebellum and primary motor cortex” (p. 2). However, new research (Liégeois et al., 2019) identifies the locus of structural and functional impairments underlying CAS as being along a dorsal pathway of cortical structures, specifically: reduced white matter and fMRI activations in sensory motor cortex and along the arcuate fasciculus and reduced grey matter and fMRI activations in superior temporal gyrus and angular gyrus.
They explain that “this route links auditory input/representation to articulatory systems … and transforms phonological representations into motor programs …In contrast, the speech execution white matter pathway (corticobulbar) and the ventral language route (IFOF) were not altered in this family” [that showed multigenerational impairments in speech praxis]. My point is that although the cerebellum is important to speech motor control and CAS may well involve impairments in proprioceptive feedback, speech is clearly a sensory motor skill that requires close connection among articulatory and auditory representations for sounds and syllables.

In Part 2 of this blog series I indicated that adults can compensate for unexpected perturbations to articulatory trajectories or auditory feedback very rapidly by drawing on their internal model of vocal tract function. It is interesting to consider that throughout speech development children cope with perturbations to articulatory gestures and expected acoustic outputs because their vocal tract is changing shape, sometimes quite dramatically, throughout childhood. Callan et al. (2000) showed how the developing child can adapt to the changing vocal tract by aiming for relatively stable auditory targets (conceived of as regions in auditory space) and using auditory feedback and simulations of auditory outputs to achieve those targets even as vocal tract structure is changing. The key to this remarkable ability is a learned mapping between articulator movements, vocal tract shapes and auditory outputs. The learning and updating of this internal model of vocal tract function arises from an unsupervised learning mechanism, essentially Hebbian learning: young infants engage in a great deal of unstructured vocal play as well as somewhat more structured babbling – speech practice that allows them to learn the necessary correspondences without having specific speech goals. Infants with CAS are widely believed to skip this period of speech development; therefore, it is likely they begin speech therapy without an internal model of vocal tract function which is foundational for goal directed speech practice. Therefore, precise, repeated, consistent speech movements may not be the best place to start a treatment program for severe CAS; a program of unstructured vocal play that targets highly varied playful vocalizations is a better starting place for many children.
Subsequently, high intensity practice with babble (repetitive syllable production) will stabilize the mappings between articulatory gestures and the resulting vocal tract configurations and somatosensory and auditory outcomes.

One of the advantages of a well-tuned internal model of vocal tract function is that it supports “motor-equivalent speech production” given commonly occurring constraints on speech production. In other words, there are many different articulatory gestures that will produce the same acoustic-phonetic goal. When the child has a stable acoustic-phonetic target and is able to process auditory feedback in relation to that target, various articulatory solutions can be found to adapt to changing vocal tract structure or constraints such as talking while eating or holding a pen between the teeth. Developmental changes in the way that articulators are coordinated to produce the same phoneme are well documented in the literature. Similarly, speech production varies with phonetic context. Motor equivalent trading relations between tongue body height and lip rounding are well known for production of the vowel [u] and the consonant [ʃ] for example and the front-back positioning of the constriction in these phonemes is highly variable across speakers and phonetic contexts. The precision with which these phonemes are produced is related to the talker’s perceptual acuity: for example, adults who have sharp perceptual boundaries between [ʃ] and [s] produce them with greater articulatory consistency as well as greater acoustic contrast between the phoneme categories. Perkell et al. (2004) speculated “In learning to maximize intelligibility, the child with higher acuity is better able to reject poor exemplars of each phoneme (as in the DIVA model), and thus will adopt sensory goals for producing those phonemes that are further apart than the child with lower acuity.” The implications for speech therapy are that, even in the case of CAS, ensuring stable acoustic-phonetic targets for speech therapy goals is essential whereas insisting upon SLP-defined articulatory parameters may be counter-productive.
The goal is not absolute consistency in the production of specific motor movements, but rather, dynamic stability in the achievement of speaking goals.

Although it is speculated that feedforward control is weighted more heavily than feedback control in adult speech, feedback is critical to speech learning during infancy and childhood. Furthermore, auditory feedback plays a crucial role. The initial goal is an auditory target. Guenther and Vladusich (2012) explain that “the auditory feedback control subsystem [helps to] shape the ongoing attempt to produce the sound by transforming auditory errors into corrective motor commands via the feedback control map in right ventral premotor cortex” (p. 2). They further explain that repeated practice of this type eventually leads to the development of somatosensory goal regions. A particular frustration for children with CAS is perseveration, the difficulty of changing a well-learned articulatory pattern to a new one that is more appropriate. This problem with perseveration highlights the need to engage the feedback control system. There are two strategies that are essential. The first is a high degree of variation in the practice materials, which can be introduced by practicing nonsense syllables with a carefully graded increase in difficulty but variation in the combination of syllables within difficulty levels. The second strategy is to provide just the right amount of scaffolding along the integral stimulation hierarchy so that the child will be successful more often than not while experiencing a certain amount of error. Some error ensures that corrective motor commands will be generated from time to time. Imagine practicing syllables that combine four consonants [b, m, w, f] with four vowels [i], [u], [æ], [ɑ] and five diphthongs [ei], [ou], [ɑi], [au], [oi], presented at random so that the child imitates the first syllable (Say [bi]) and then repeats it again twice (Say it again… and again…), before proceeding to another syllable. You will have a great many targets in your session but created from a small number of elements.
Imagine further that you progress to a more difficult level (reduplicated syllables, [bubu], [mimi]) as soon as the child achieves 80% correct production of the single syllables. You can see that you will also be allowing the child to produce quite a bit of error. We call this the challenge point. Tanya Matthews, Francoise Brosseau-Lapré and I are working on a paper to describe how to do this and describe our experiences with the approach. You will see that it is very different from working on five words and requiring that the child achieve 15 to 20 correct productions at the imitative word level before proceeding to delayed imitation and then again before proceeding to spontaneous productions. Errorless learning is a fundamental aspect of DTTC and has a long history in speech therapy practice. However it is not clear that it is well-motivated from the perspective of developmental science.
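The practice scheme imagined above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the use of a random seed, and the way percent correct is computed over a simple list of scored productions are all hypothetical choices made for this sketch, not the procedure from the paper in preparation. The consonants, vowels, diphthongs, and the 80% criterion come from the text.

```python
import random

# Elements named in the text: four consonants, four vowels, and the listed diphthongs.
CONSONANTS = ["b", "m", "w", "f"]
VOWELS = ["i", "u", "æ", "ɑ"]
DIPHTHONGS = ["ei", "ou", "ɑi", "au", "oi"]


def build_targets(reduplicated=False):
    """Combine every consonant with every vowel and diphthong.

    With reduplicated=True each syllable is doubled (e.g. "bu" -> "bubu"),
    which is the next difficulty level described in the text.
    """
    syllables = [c + v for c in CONSONANTS for v in VOWELS + DIPHTHONGS]
    if reduplicated:
        return [s + s for s in syllables]
    return syllables


def session_order(targets, seed=None):
    """Return the targets in random order (the seed only makes a run repeatable)."""
    rng = random.Random(seed)
    order = list(targets)
    rng.shuffle(order)
    return order


def ready_to_move_up(outcomes, criterion=0.8):
    """True once the proportion of correct productions meets the criterion.

    `outcomes` is a list of True/False scores, one per production. Scoring
    over a flat list like this is an assumption made for illustration.
    """
    if not outcomes:
        return False
    return sum(outcomes) / len(outcomes) >= criterion


if __name__ == "__main__":
    for syllable in session_order(build_targets(), seed=1)[:3]:
        # One imitation followed by two repetitions, as described above.
        print(f"Say [{syllable}] ... again ... and again")
```

Because each target is imitated once and repeated twice, a full pass through the single-syllable list would give three productions per target. Moving up at 80% correct rather than at 100% is what leaves room for some error at each level.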

To summarize, there are many aspects of DTTC that are similar across all sensory-motor approaches to the treatment of CAS. In particular high intensity speech practice is well motivated and likely to be effective with all forms of moderate and severe speech sound disorder. Nonetheless there are some significant differences between Strand’s approach and the approach that I recommend based on an updated theory of speech motor control. There is still a great deal of research to do because very few of our specific speech therapy practices have received empirical validation even though speech therapy in general has been shown to be efficacious. As a guide to future research (hopefully using randomized and thus interpretable designs), I provide a table of procedures that are similar and different across the two theoretical approaches.




Treatment Procedures that are Similar

High intensity practice
Focus on speech movements (not phonemes)
Practice syllable sized units (not isolated sounds)
Attend to temporal aspects of trial structure (delayed imitation, delayed provision of feedback)
Integral stimulation hierarchy (attend to visual and auditory aspects of target)

Treatment Procedures that are Different

DTTC (Strand) | Updated approach recommended here
Focus on precise, consistent movements | Focus on dynamic stability
Over-practice: accuracy over 10-20 trials | Variable practice when possible
Errorless learning | Challenge point: 4/5 correct, then move up
Behavioral shaping of accurate movements | Motor equivalent movements
Tactile and gestural cues to ensure accuracy | Sharpen knowledge of auditory target
“Hold” initial configurations | Encourage vocal play, develop internal model


Callan, D. E., Kent, R. D., Guenther, F. H., & Vorperian, H. K. (2000). An auditory-feedback-based neural network model of speech production that is robust to developmental changes in the size and shape of the articulatory system. Journal of Speech, Language, and Hearing Research, 43, 721-738.

Guenther, F. H., & Vladusich, T. (2012). A neural theory of speech acquisition and production. Journal of Neurolinguistics, 25(5), 408-422.

Liégeois, F. J., Turner, S. J., Mayes, A., Bonthrone, A. F., Boys, A., Smith, L., . . . Morgan, A. T. (2019). Dorsal language stream anomalies in an inherited speech disorder. Brain, 142(4), 966-977.

Perkell, J., Matthies, M., Lane, H., Guenther, F. H., Wilhelms-Tricarico, R., Wozniak, J., & Guiod, P. (1997). Speech motor control: Acoustic goals, saturation effects, auditory feedback and internal models. Speech Communication, 22, 227-250.

Perkell, J., Matthies, M. L., Tiede, M., Lane, H., Zandipour, M., Marrone, M., . . . Guenther, F. H. (2004). The distinctness of speakers’ /s/-/ʃ/ contrast is related to their auditory discrimination and use of an articulatory saturation effect. Journal of Speech, Language, and Hearing Research, 47, 1259-1269.

Rvachew, S., & Matthews, T. (2017). Demonstrating treatment efficacy using the single subject randomization design: A tutorial and demonstration. Journal of Communication Disorders, 67, 1-13.

Rvachew, S., & Matthews, T. (2019). An N-of-1 Randomized Controlled Trial of Interventions for Children With Inconsistent Speech Sound Errors. Journal of Speech, Language, and Hearing Research, 62, 3183-3203.

Speech Therapy and Theories of Speech Motor Control: Part 2

In Part 1 of this blog series I described the theoretical basis of Dynamic Temporal and Tactile Cueing as recently published by Edy Strand. Specifically, the treatment is founded on Schmidt’s Schema Theory in which generalized motor programs are learned. During speech production the child must select the right program and apply the correct parameters before implementing it all at once. If the parameters are selected incorrectly, a speech error will occur. It is rather like making toast. If you forget to reset your settings after toasting bagels, your Wonderbread will come out black! The problem as stated by Schmidt is that by the time you realize that your toast settings are wrong and your motor gestures are off track, it’s too late: the toast is burned and you have said “Trat! Doast!” Learning occurs by “trial and error”: after much experience with your toaster you learn the settings (parameters) for getting the right amount of toastiness for different items. Learning to operate your toaster is similar to acquiring one “generalized motor program.” Speech motor learning is assumed to operate this way because sensory feedback is too slow to support on-line adjustments to the parameters in a direct way. I used a different analogy in the previous blog: once you have committed to swinging your golf club, you tend to follow through.

The problem with this model of speech motor control is that we know for certain that real time modification of vocal tract movements occurs in response to somatosensory and auditory feedback. Strangely we have known since the early eighties that the speech system is highly sensitive to error on-line; therefore, I don’t know why this idea of open-loop control persists. The proof comes from studies in which (typically) an adult is asked to repeatedly produce a particular syllable or disyllable and then experiences a perturbation in sensory feedback (either somatosensory feedback or auditory feedback). An early example of this paradigm involved productions of “aba”: during 15% of trials a mechanism placed an unexpected load on the talker’s lower lip. Here is where it gets interesting: the research participants corrected for this perturbation in the articulatory trajectory of the bottom lip very rapidly with compensatory actions of the top and the bottom lip (the bottom lip would need to exert greater upward force and the top lip would need to produce greater downward extent in order to produce the labial closure and the expected transitions into and out of the consonantal closure). Decades of experiments have followed involving many other perturbations in the domain of articulatory gestures, somatosensory (skin) sensations, and auditory feedback. For example, while the research participants are repeatedly saying “bed” you can trick their ear into thinking they are saying “bad” which leads to compensatory adjustments in articulation to get the expected auditory percept.

This kind of dynamic compensation across the entire vocal tract is made possible by an “internal model” — a neural model that simulates the behavior of a sensorimotor system in relation to its environment. The internal model can generate a prediction of the sensory consequences of implementing a motor plan via simulation. For speech, future outputs in the somatosensory and auditory domains are simulated; furthermore, the simulator takes into account delayed sensory feedback, noise in the perceptual system and other variables so that when feedback arrives it can be compared with the prediction and provide reliable error messages. Continuous tracking of the vocal tract state is thus permitted and forms the basis for ongoing planning of movements as speech unfolds. If an unexpected event occurs, as in the perturbation experiments that I have described, error corrections are dynamic across the entire system; therefore, if the predicted trajectory of acoustic formant transitions from the [a] into the [b] closure is not occurring, lower lip, upper lip, jaw and tongue movements can all be harnessed to produce the desired outcome.
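
For readers who find it easier to think in code, here is a toy sketch of the idea. It is not the Houde and Nagarajan model or any published simulation. It assumes a single articulator position (think of lip closure as a number moving from 0 to 1), a one-step perturbation like the unexpected lip load, and sensory feedback that arrives a few steps late. All names and numbers (`simulate`, `gain`, `k_correct`, the delay of 3 steps) are hypothetical choices made for illustration.

```python
def simulate(use_feedback, steps=60, delay=3, gain=0.3, k_correct=0.1,
             perturb_at=5, perturb=0.5, target=1.0):
    """Move a 1-D 'lip position' from 0 toward target; return its trajectory."""
    x = 0.0              # true articulator position
    x_hat = 0.0          # internal model's estimate of that position
    actual = [x]         # actual[t]: true position at the start of step t
    predicted = [x_hat]  # predicted[t]: estimate held at that same moment
    for t in range(steps):
        # The command is planned from the internal estimate, not raw feedback.
        u = gain * (target - x_hat)
        x += u
        if t == perturb_at:
            x -= perturb  # unexpected load pushes the lip away from the target
        # Forward model: predict the consequence of the command just issued.
        x_hat += u
        if use_feedback and t >= delay:
            # Feedback is old: it reports the position `delay` steps ago, so it
            # is compared with the prediction that was made for that moment.
            error = actual[t - delay] - predicted[t - delay]
            x_hat += k_correct * error
        actual.append(x)
        predicted.append(x_hat)
    return actual


open_loop = simulate(use_feedback=False)
with_model = simulate(use_feedback=True)
print(f"open loop ends at {open_loop[-1]:.3f}")
print(f"with internal model ends at {with_model[-1]:.3f}")
```

In the open-loop run the movement should end short of the target by roughly the size of the perturbation, because nothing ever tells the planner that the lip was displaced. In the second run the mismatch between late feedback and the stored prediction gradually corrects the estimate, and the ongoing commands make up the difference. The real system distributes that correction across lips, jaw and tongue; the sketch collapses all of that into one number.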

As Houde and Nagarajan (2011) explain, “speech motor control is not an example of pure feedback control or feedforward control” (p. 11). The acquisition of speech motor control is dependent upon the development of the internal model of vocal tract function as well as detailed knowledge of auditory targets. This understanding has implications for the treatment of childhood apraxia of speech. I will explore these implications further in the next and final blog in this series.


Abbs, J. H., & Gracco, V. L. (1983). Sensorimotor actions in the control of multi-movement speech gestures. Trends in Neurosciences, 6, 391-395.

Houde, J. F., & Jordan, M. I. (2002). Sensorimotor adaptation of speech I: Compensation and adaptation. Journal of Speech, Language & Hearing Research, 45(2), 295-310.

Houde, J. F., & Nagarajan, S. S. (2011). Speech production as state feedback control. Frontiers in Human Neuroscience, 5, doi: 10.3389/fnhum.2011.00082.

Tourville, J. A., Reilly, K. J., & Guenther, F. H. (2008). Neural mechanisms underlying auditory feedback control of speech. NeuroImage, 39, 1429-1443.

Speech Therapy and Theories of Speech Motor Control: Part I

Edy Strand recently published a detailed description of her Dynamic Temporal and Tactile Cueing treatment strategy. As she says, this is a hugely valuable paper because it provides a complete description of a treatment designed for severe speech sound disorders, especially Childhood Apraxia of Speech, and more importantly, it summarizes in one place the theoretical foundation for the treatment. I think that, on the whole, this is an efficacious treatment. However, some procedures, derived directly from the outdated theoretical underpinnings, are questionable, and therefore I am going to devote several blogs to more recent theory and basic science research on the development of speech motor control and apraxia of speech. In this first blog, I review Schema Theory, even though this theory is just not right! But it has a long history and remains popular across almost all clinically-oriented papers on motor speech disorders.

The theory that is referenced in Edy Strand’s paper is Richard Schmidt’s “Schema Theory of Discrete Motor Skill Learning,” published in Psychological Review in 1975 and subsequently brought to speech-language pathology by Ray Kent and others as a useful framework for thinking about speech therapy. The important idea underlying this theory is that motor skills are made up of brief, discrete motor acts that are executed all-at-once as open-loop generalized motor programs, adapted with specific response specifications (called parameters) for the current conditions. The theory assumes “open-loop” control because sensory feedback is often too slow to impact movement after it has started. According to this theory, feedback is processed after the movement is over and incorporated into the schema for the future execution of the generalized motor program. I have used golf as an example before; even though I haven’t played much in years, let’s do it again. If we adopt this theory, we would think of practice sessions as developing different generalized motor programs for each type of shot: a long drive, a short 7-iron shot, the up-and-down pitch onto the green, and the putt into the hole. Which shot you choose depends upon your recall schema: what is your target and which type of shot is likely to achieve it? I personally recall that when close to the green my pitch is better than my chip (whereas my husband has the opposite preference). How you address the ball depends upon the initial conditions (flat ground, hill, tall grass etc.). The motor control parameters (also known as response specifications) depend upon the distance to the target (how high to lift the club, speed of follow through, force applied and so on).
Based on the initial conditions and the desired outcome, I launch the shot with my wedge, expecting a certain “feel” as I hit the ball based on past experience with the sensory consequences of hitting this shot; I can always “recognize” a good hit even before I see the ball land (often I just turn my back on the ball, I don’t even want to see it land!). But in any case, the actual outcome is important for updating the “recall” schema; specifically, if I have actually achieved my target, I add all this information, the initial conditions, the response specifications, the recognition schema and the recall schema to my memory. The generalized motor program is an abstraction across all these remembered practice trials, permitting correct specification of the response parameters in future shots. Furthermore, I should be able to adapt the generalized motor program to similar shots, even if the ball is a little further or closer to the green for example.
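
The logic of the recall schema can be sketched in a few lines of code. This is a deliberately stripped-down illustration, not Schmidt’s formal model. It assumes a single parameter (force), ignores initial conditions and the recognition schema, and uses a made-up linear world in which distance depends only on the force chosen in advance. The class `RecallSchema` and the function `execute_shot` are hypothetical names invented for this example.

```python
class RecallSchema:
    """Stores past (parameter, outcome) pairs and fits a line through them."""

    def __init__(self):
        self.trials = []

    def update(self, parameter, outcome):
        # Feedback is used only after the movement is over.
        self.trials.append((parameter, outcome))

    def select_parameter(self, target, default=1.0):
        """Pick the force expected to reach the target, based on past trials."""
        n = len(self.trials)
        if n < 2:
            return default  # not enough experience yet: just guess
        mean_p = sum(p for p, _ in self.trials) / n
        mean_o = sum(o for _, o in self.trials) / n
        var = sum((p - mean_p) ** 2 for p, _ in self.trials)
        cov = sum((p - mean_p) * (o - mean_o) for p, o in self.trials)
        if var == 0 or cov == 0:
            return mean_p  # no usable relationship learned yet
        slope = cov / var
        intercept = mean_o - slope * mean_p
        return (target - intercept) / slope


def execute_shot(force):
    """Hypothetical world: the shot runs open loop, with no mid-swing correction."""
    return 2.0 * force + 1.0


schema = RecallSchema()
for force in (1.0, 2.0, 3.0):  # practice trials, each stored after the fact
    schema.update(force, execute_shot(force))

new_target = 10.0  # a distance that was never practiced
force = schema.select_parameter(new_target)
print(f"chosen force {force:.2f} lands at {execute_shot(force):.2f}")
```

Two features of the theory show up here. The parameter is fixed before the shot and nothing inside `execute_shot` can change it, which is the open-loop assumption. And the stored trials generalize to a new distance, which is what the abstraction across practice trials is supposed to buy you.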

When applied to CAS, in which current research suggests unreliable or degraded somatosensory feedback, the use of this model focuses attention on the child’s processing of initial conditions, inaccurate planning or programming of the movement due to poor selection of response specifications, and/or poor recognition schema (not knowing when the movement “feels right”). Therefore, certain procedures are recommended. DTTC providers use manual or gestural cues to shape the child’s articulators into the “initial position” and encourage the child to “hold” the position momentarily so as to fully process those initial conditions before launching the movement. During the initial stages of therapy, the SLP uses a slow rate and co-production so that the child is getting extra feedback during the practice trial, presumably with the goal of stabilizing the recognition schema. Imitative models support the child’s knowledge of the target which, when combined with copious knowledge of results feedback, should support the development of recall schema. And finally, a great deal of practice with an errorless approach ensures that the child lays down many memory traces of correctly executed motor programs.

The recommendations that are provided make a certain amount of sense given the context of schema theory (even though there is in fact no evidence for the specific efficacy of any one of these particular procedures). The problem is that it is not clear that schema theory is a reasonable foundation for modern speech therapy practice.

First, Richard Schmidt himself cautioned in 2003 that “schema theory was intended to be an account of discrete actions. Hence, continuous actions, such as steering a car or juggling, which are both of longer duration (allowing time for response-produced feedback to have a role) and more based on the performer’s interactions with the environment were outside the area for schema theory…long-duration actions might be based on interplay between open-loop subactions and feedback-based corrections… . Interestingly, tasks such as juggling seem appropriate for analysis in terms of the dynamical systems perspective” (p. 367). I would argue that our understanding of not only juggling but also speech motor control has benefited immensely from the dynamical systems perspective, and I will come back to that in the next blog. If juggling is considered too complex and continuous to be explained by schema theory, speech is probably not a good fit either.

Second, modern theories of speech motor control have shown that on-line correction of motor action even over short durations occurs despite the limitations of feedback control. The explanation lies in the continuous operation of feedforward control mechanisms. More on feedforward control in another blog.


Rvachew, S., & Brosseau-Lapré, F. (2012). Developmental Phonological Disorders: Foundations of Clinical Practice. San Diego, CA: Plural Publishing.

Schmidt, R. A. (1975). A schema theory of discrete motor skill learning. Psychological Review, 82(4), 225-260. doi:10.1037/h0076770

Schmidt, R. A. (2003). Motor schema theory after 27 years: Reflections and implications for a new theory. Research Quarterly for Exercise and Sport, 74(4), 366-375.

Strand, E. A. (2019, Early View). Dynamic Temporal and Tactile Cueing: A Treatment Strategy for Childhood Apraxia of Speech. American Journal of Speech-Language Pathology. doi:10.1044/2019_AJSLP-19-0005