Would you do speech therapy like this?

I was interested to read a paper about the relative efficacy of using traditional flash cards versus tablet presentation of pictures for articulation drill therapy because I have developed iPad apps myself (e.g., see www.DIALspeech.com) and have an interest in the potential of digital tools to enhance the speech therapy experience. The paper was recently published in the Online First section of Communication Disorders Quarterly by Krystel Werfel, Marren Brooks, and Lisa Fitton.

The study used a single subject alternating treatment design with four kindergarten-aged subjects who did not clearly exhibit signs of speech delay but nonetheless misarticulated two phonemes that could be practiced. Some statistical analyses (rather dubiously applied to single subject data) suggested that the children achieved mastery sooner in the flashcard condition but produced more correct responses in the tablet condition. To my eye, the data did not suggest a clear advantage for either condition. All the children did in fact master the treated phonemes, which were /z,s/, /pl,ɡl/, and /θ,ð/ (the latter pair for two children).

The authors make clear that the study is meant to be informative about the modality of stimulus presentation and not a test of the treatment protocol itself, but I found myself alarmed at the possibility that readers might think the treatment protocol would be reasonable in regular clinical practice, and therefore I would like to address the way that the intervention was implemented. Researchers often implement a speech therapy intervention in a way that they would not in a regular clinical environment, in an effort to exert more experimental control over the variables than is typically necessary or desirable in an authentic clinical context. I can only hope that this explains some of the clinical choices that were made in this case. I will address several in turn: (1) treatment approach; (2) treatment procedure; (3) reinforcement procedures; (4) cumulative intervention intensity; and (5) discharge criteria.

First, the authors state that they chose a traditional approach to therapy because there is empirical evidence that it works and clinicians prefer it. There is evidence of efficacy, but for most preschool-aged children who qualify for speech services a phonological approach may be more efficacious, as Francoise and I discuss in our text. Furthermore, the surveys indicating a preference for a traditional approach suggest that this preference holds in the United States but not elsewhere. Finally, there seems to be some confusion about what a “traditional” approach is. In some cases, traditional refers to a strict behaviorist intervention that focuses solely on speech production with a gradual increase in the complexity of speech units; in other cases it involves a sensory-motor approach with careful attention to variable speech practice and multiple targets; in still other cases a traditional approach means Charles Van Riper’s approach, which was properly sensory-motor, including ear training, graduated speech practice, and some principles of motor learning. The implementation in this paper was highly restricted, involving only practice of single words and, when necessary, isolated sounds. If the speech therapist chooses a traditional rather than a phonological approach, it is best that the full sensory-motor protocol be implemented.

Second, the drill-based procedure that was employed was selected, again, on empirical grounds. The evidence cited to support this approach is sound: drill is effective, especially when treating children who have good speech perception abilities, which was most likely the case for the children in this study, who did not have clear evidence of a speech disorder. Other approaches can be effective if procedures targeting phonological processing are incorporated into the intervention, as shown by Hesketh and colleagues in the U.K. and by Francoise and me with French-speaking children.

The strangest part of the whole intervention is that the children experienced over 25 treatment sessions each, and every session consisted of identical practice trials: a stimulus prompt was presented, the child attempted to name the picture, the clinician provided feedback or extra support, and then, if the child’s response was correct, he or she was permitted to mail the flash card or swipe the picture on the tablet. That was it. For eight weeks. I’m speechless. Enough said.

Regarding cumulative intervention intensity, I have indicated in previous blogs that children should receive a minimum of 50 practice trials, and ideally 100 practice trials, per session. Furthermore, other single subject research using minimal pairs procedures indicates that generalization goals are not usually met with fewer than 180 practice trials (when treating children with moderate or severe phonological delays). In Werfel’s study the children received treatment for two sounds in 20 minutes: ten minutes per sound, at 15 practice trials per sound per 10-minute block, therefore 30 practice trials per 20-minute treatment session. Reportedly, mastery was achieved after 203 trials in the flashcard condition and 270 trials in the tablet condition (equivalent to 135 and 180 minutes of therapy, respectively). However, increasing the number of practice trials to 50 during that 20-minute session could reduce the number of sessions or weeks in the intervention program by almost half. One way to do that would be to reduce the amount of feedback that was provided. The intervention was designed so that the clinician provided explicit feedback to the child after every practice attempt, whereas the principles of motor learning suggest that less feedback is often better for speech motor learning. For example, a child can name five pictures in a row and be told that four of the five productions were correct. Another strategy is to practice at the challenge point at all times, as described in detail by Francoise and me in Developmental Phonological Disorders: Foundations of Clinical Practice and in our new undergraduate text Introduction to Speech Sound Disorders.
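To make the dose arithmetic concrete, here is a minimal sketch in Python using the trial counts reported above; the 50-trials-per-session rate is my recommended minimum, not a parameter of the Werfel study.

```python
# Dose arithmetic for the protocol described above:
# 15 practice trials per sound per 10-minute block = 1.5 trials per minute.
reported_rate = 15 / 10

def minutes_to_reach(trials_needed, trials_per_minute):
    """Minutes of therapy needed to accumulate a given number of practice trials."""
    return trials_needed / trials_per_minute

for condition, trials in [("flashcard", 203), ("tablet", 270)]:
    print(f"{condition}: {minutes_to_reach(trials, reported_rate):.0f} minutes")
# flashcard: 135 minutes; tablet: 180 minutes

# At 50 trials per 20-minute session (2.5 trials per minute), the same
# cumulative dose is reached much sooner:
print(f"{minutes_to_reach(203, 50 / 20):.0f} minutes")  # ~81 minutes
```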

Finally, the discharge or stopping criterion in the study was set at 100% correct performance on the generalization probe over 3 consecutive sessions. The probe contained 5 treated words and 5 untreated words. This criterion meant that children practiced their targets long past the point at which the practice material should have been made more difficult or the child should have been discharged to see if spontaneous generalization to natural speaking situations would occur. As Francoise and I review in Chapter 8 of our book, several studies have shown that children can be discharged after achieving between 40 and 80% correct responding on generalization probes. Most children will continue to make gains in production accuracy after this point. By these criteria, the four children in the Werfel et al. study received an average of 5 unnecessary treatment sessions.

When conducting treatment studies, it is helpful to provide models of treatment procedures that reflect best practice in the clinical setting. Often interventions that are merely better than no intervention will prove to be effective in a research setting while not being best practice. These studies are, I think, confusing for a clinical audience. Furthermore, when asking clinical questions about new technologies it is worth asking: why would we want to bring this technology into our clinical practice? What benefit might it bring? How can we adapt these technologies so that the best of human interaction is retained and the most benefit of the technology is added? In my next blog I will address the Werfel study again, this time imagining the questions we might ask about tablet-based implementations of articulation therapy.


Conversations with SLPs: Nonword Practice Stimuli

I often answer queries from speech-language pathologists about their patients or about more abstract matters of theory or clinical practice, and sometimes the conversations are general enough to turn into blog topics. On this occasion I was asked my opinion about a specific paper, the question being generally about the credibility of the results and the applicability of the findings to clinical practice:

Gierut, J. A., Morrisette, M. L., & Ziemer, S. M. (2010). Nonwords and generalization in children with phonological disorders. American Journal of Speech-Language Pathology, 19, 167-177.

In this paper the authors conduct a retrospective review of post-treatment results obtained from 60 children with moderate-to-severe phonological delay who had been treated in the context of research projects gathered under the umbrella of the “learnability project”. Half of these children had been taught nonwords and the remainder real words, in both cases representing phonemes for which the children demonstrated no productive phonological knowledge. The words (both the nonword targets and the real word targets) were taught in association with pictured referents, first in imitation and then in spontaneous production tasks. Generalization to real word targets was probed post-treatment. Note that the phonemes probed included those that were treated and any others that the child did not produce accurately at baseline. The results show an advantage for treated over untreated phonemes that was maintained over a 55-day follow-up interval. Greater generalization was observed for children who received treatment with nonwords than for children who received treatment with real words, but only for treated phonemes and only immediately post-treatment, because over time the children treated with real words caught up to the other group.

OK, so what do I think about this paper? Overall, I think that it provides evidence that it is not harmful to use nonwords in treatment, which is a really nice result for researchers. As Gierut et al. explain, nonwords are handy because “they have been incorporated into research as a way of ensuring experimental control within and across children and studies.” They can be designed to target the specific phonological strengths and needs of each child, and it is very unlikely that family or school personnel will practice them outside the clinic, so it is possible to conclude that change is due to the experimental manipulation. Gierut et al. go one step further, however, and conclude that nonword stimuli might offer an advantage for generalization learning because “the newness of the treated items might reduce interference from known words.” Here I think the evidence is weaker, simply because this is a nonexperimental study. The retrospective nature of the study, and the fact that children were not randomly assigned within a single cohort to be taught with one set of stimuli versus the other while holding other aspects of the design constant, limits the conclusions that one can draw. For example, the authors point out that the children who were treated with nonwords received more treatment sessions than those treated with real words. Therefore, in terms of clinical implications, the study does not offer much guidance to the SLP beyond suggesting that there may be no harm in using nonword stimuli if the SLP has specific reasons for doing so.

We can, however, offer prospective experimental evidence on this topic from my lab. It is also limited, in that it involves only two children, but both were treated within a single subject randomization design that provides excellent internal validity. This study was conducted by my former student Dr. Tanya Matthews with support from Marla Folden, M.Sc., S-LP(C). The interventions were provided by McGill students in speech-language pathology who were completing their final internship. The two children presented with very different profiles: TASC02 had childhood apraxia of speech with an accompanying cognitive delay and ADHD, whereas TASC33 presented with a mild articulation delay and verbal and nonverbal IQ within normal limits.

Both children were treated according to the same protocol: they received 18 treatment sessions, provided 3 per week for 6 weeks. Each week they experienced three different treatment conditions, each randomly assigned to one of the 3 sessions and paired with a unique target, as shown in the table below for the two children. Each session consisted of a prepractice portion and a practice portion. The prepractice was either Mixed Procedures (auditory bombardment, error detection tasks, phonetic placement, segmentation and chaining of segments within the words) or Control (no prepractice). In all three conditions the practice was high intensity practice employing principles of motor learning.

[Table: real word vs nonword conditions]

Random assignment of condition/target pairs to sessions within weeks permits the use of resampling tests to determine whether there are statistically significant differences in outcomes as a function of treatment condition. Outcomes were assessed via imitation probes administered at the end of each treatment session to measure generalization to untreated items (same day probes) and probes administered approximately 2 days later, at the beginning of the next treatment session, to measure maintenance of those learning gains (next day probes). The next table shows the mean probe scores by condition and child, the test statistic (squared mean differences across conditions), and the associated p value for the treatment effect for each child.

[Table: real word vs nonword outcomes]

The data shown in this table reveal no significant results for either child, for either same day or next day probe scores. In other words, there was no advantage for prepractice over no prepractice, and no advantage for nonword practice over real word practice.
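For readers who want to see the mechanics of such a resampling test, here is a minimal sketch in Python. The probe scores are made up for illustration (they are not Tanya’s data), and the Monte Carlo loop approximates full enumeration of the 3!^6 = 46,656 possible within-block assignments; the test statistic is the sum of squared pairwise differences among the condition means, as described above.

```python
# Resampling test for a block-randomized alternation design: conditions are
# randomly assigned to sessions within each weekly block, so under the null
# hypothesis the condition labels can be permuted within blocks.
import numpy as np

rng = np.random.default_rng(1)

# 6 weekly blocks x 3 sessions; hypothetical same day probe scores (% correct).
scores = np.array([[20, 25, 15],
                   [30, 35, 20],
                   [40, 30, 35],
                   [45, 50, 40],
                   [55, 45, 50],
                   [60, 65, 55]], dtype=float)

# Actual assignment of conditions 0, 1, 2 to the three sessions in each block.
labels = np.array([rng.permutation(3) for _ in range(6)])

def test_statistic(labels):
    """Sum of squared pairwise differences among the three condition means."""
    means = [scores[labels == c].mean() for c in range(3)]
    return sum((a - b) ** 2 for i, a in enumerate(means) for b in means[i + 1:])

observed = test_statistic(labels)

# Monte Carlo approximation to the randomization distribution.
n_resamples = 5000
hits = sum(
    test_statistic(np.array([rng.permutation(block) for block in labels])) >= observed
    for _ in range(n_resamples)
)
print(f"p = {(hits + 1) / (n_resamples + 1):.3f}")
```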

We hope to publish data soon suggesting that the specific type of prepractice might make a difference for certain children. But overall, the most important driver of outcomes for children with speech sound disorders seems to be practice, and lots of it.

Is Acoustic Feedback Effective for Remediating “r” Errors?

I am very pleased to see a third paper published in the speech-language pathology literature using the single subject randomization design that I have described in two tutorials, the first in 1988 and the second more recently. Tara McAllister Byun used the design to investigate the effectiveness of acoustic biofeedback treatment to remediate persistent “r” errors in 7 children aged 9 to 15 years. She used the single subject randomized alternation design with block randomization, including a few unique elements in her implementation. She and her research team provided one traditional treatment session and one biofeedback treatment session each week for ten weeks; however, the order of the traditional and biofeedback sessions was randomized each week. Interestingly, each session targeted the same items (i.e., “r” was the speech sound target in both treatment conditions): rhotic vowels were tackled first and consonantal “r” was introduced later, in a variety of phonetic contexts. (This procedure is at variance with my own experience, in which, for example, Tanya Matthews and I randomly assign different targets to different treatment conditions.) Another innovation is the outcome measure: a probe constructed of untreated “r” words was given at the beginning and end of each session, so that change over the session (Mdif) was the outcome measure submitted to statistical analysis (our tutorial explains that an advantage of the SSRD is that a nonparametric randomization test can be used to assess the outcome of the study, yielding a p value). In addition, 3 baseline probes and 3 maintenance probes were collected so that an effect size for overall improvement could be calculated. In this way there are actually 3 time scales for measuring change in this study: (1) change from baseline to maintenance probes; (2) change from baseline to treatment performance, as reflected in the probes obtained at the beginning of each session and plotted over time; and (3) change over a session, reflected in the probes given at the beginning and the end of each session. Furthermore, it is possible to compare differences in within-session change for sessions provided with and without acoustic feedback.
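As an aside, the weekly block randomization scheme used here is simple to generate; the following is a minimal sketch in Python of the general idea, not the procedure Byun’s team actually used.

```python
# Block randomization for the alternation design: one traditional and one
# biofeedback session per week, in random order, for ten weekly blocks.
import random

random.seed(2017)  # seeded only to make the illustration reproducible
for week in range(1, 11):
    first, second = random.sample(["traditional", "biofeedback"], k=2)
    print(f"Week {week}: session 1 = {first}, session 2 = {second}")
```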

I was really happy to see the implementation of the design, but it is fair to say that the results were a dog’s breakfast, as summarized below:

[Table: Byun 2017 acoustic biofeedback results]

The table indicates that two participants (Piper and Clara) showed an effect of biofeedback treatment and generalization learning. Both showed rapid change in accuracy overall after treatment was introduced in both conditions and maintained at least some of that improvement after treatment was withdrawn. Garrat and Ian showed identical trajectories in the traditional and biofeedback conditions, with a late rise in accuracy during treatment sessions, large within-session improvements during the latter part of the treatment period, and good maintenance of those gains; neither boy achieved 60% correct responding at any point in the treatment program, however. Felix, Lucas, and Evan demonstrated no change in probe scores across the twenty sessions of the experiment in either condition. Lucas started at a higher level and therefore his probe performance is more variable: because he actually showed a within-session decline during traditional sessions while showing stable performance within biofeedback sessions, the statistics indicate a treatment effect in favour of acoustic biofeedback, but in fact no actual gains were observed.

So, this long description of the results brings me to two conclusions: (1) the alternation design was the wrong choice for the hypothesis in these experiments; and (2) biofeedback was not effective for these children; even in those cases where it looks like there was an effect, the children were responsive to both biofeedback and the traditional intervention.

In a previous blog I described the alternation design; there is, however, another version of the single subject randomization design that would be more appropriate for Tara’s hypothesis. The thing about acoustic biofeedback is that it is not fundamentally different from traditional speech therapy, involving a similar sequence of events: (i) the SLP says a word as an imitative model; (ii) the child imitates the word; (iii) the SLP provides informative or corrective feedback. In the case of incorrect responses in the traditional condition in Byun’s study, the SLP provided information about articulatory placement and reminded the child that the target involved certain articulatory movements (“make the back part of your tongue go back”). In the case of incorrect responses in the acoustic biofeedback condition, the SLP made reference to the acoustic spectrogram when providing feedback and reminded the child that the target involved certain formant movements (“make the third bump move over”). The first two steps are completely overlapping in both conditions, and it can be expected that the articulatory cues given in the traditional condition will be remembered and their effects will carry over into the biofeedback sessions. Therefore we can consider acoustic biofeedback to be an add-on to traditional therapy, and what we want to know about is the value added. For that question the phase design is more appropriate. In this case there would be 20 sessions (2 per week over 10 weeks, as in Byun’s study), each planned with the same format: beginning probe (optional), 100 practice trials with feedback, ending probe. The difference is that the starting point for the introduction of acoustic biofeedback would be selected at random: all the sessions that precede the randomly selected start point would be conducted with traditional feedback, and all the remainder would be conducted with acoustic biofeedback. The first three sessions would be designated as traditional and the last 3 as biofeedback, for a 26-session protocol as described by Byun. Across the 7 children this would end up looking like a multiple baseline design, except that (1) the duration of the baseline phase would be determined by random selection for each child; and (2) the baseline phase is actually the traditional treatment, with the experimental phase testing the value-added benefit of biofeedback. There are three possible categories of outcome: no change after introduction of the biofeedback, an immediate change, or a late change. As with any single subject design, the change might be in level, trend, or variance, and the test statistic can be designed to capture any of those types of change. The statistical analysis asks whether the obtained test statistic is bigger than the possible results given all of the possible random selections of starting points. Rvachew & Matthews (2017) provides a more complete explanation of the statistical analysis.

I show below an imaginary result for Clara, using the data presented for her in Byun’s paper, as if the traditional treatment came first and then the biofeedback intervention. If we pretend that the randomly selected start point for the biofeedback intervention occurred exactly in the middle of the treatment period, the test statistic is the difference between the M(bf) and M(trad) scores, resulting in -2.308. All other possible random selections of starting points for the intervention lead to 19 other possible mean differences, and 18 of them are bigger than the obtained test statistic, leading to a p value of 18/20 = .9. In this data set the probe scores are actually bigger in the earlier part of the intervention, when the traditional treatment is used, and they do not get bigger when the biofeedback is introduced. These are the beginning probe scores obtained by Clara, but Byun obtained a significant result in favour of biofeedback by block randomization and by examining change across each session. However, I am not completely sure that the improvements from beginning to ending probes are a positive sign; this result might reflect a failure to maintain gains from the previous session in one or the other condition.

[Figure: hypothetical result for Clara in a single subject randomization phase design]
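Here is a minimal sketch of that randomization test in Python, with hypothetical probe scores standing in for Clara’s. The admissible window of start points (at least three sessions in each phase) and the one-tailed direction of the test are illustrative choices, not the exact parameters of Byun’s protocol.

```python
# Randomization test for the phase design described above, using hypothetical
# beginning-of-session probe scores (percent correct) for 20 sessions.
import numpy as np

probes = np.array([10, 12, 15, 18, 20, 22, 21, 24, 23, 25,
                   24, 26, 25, 27, 26, 28, 27, 29, 28, 30], dtype=float)

# Require at least 3 traditional and 3 biofeedback sessions, so the biofeedback
# phase may begin at any of sessions 4 through 18 (0-indexed starts 3..17).
starts = range(3, len(probes) - 2)

def mean_diff(start):
    """Test statistic: biofeedback-phase mean minus traditional-phase mean."""
    return probes[start:].mean() - probes[:start].mean()

actual_start = 10  # suppose session 11 was the randomly selected start point
observed = mean_diff(actual_start)

# The randomization distribution is simply the statistic at every admissible
# start point; the p value is the proportion at least as large as observed.
distribution = [mean_diff(s) for s in starts]
p = sum(d >= observed for d in distribution) / len(distribution)
print(f"observed difference = {observed:.3f}, p = {p:.3f}")
```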

There are several reasons to think that both of the interventions used in Byun’s study might result in unsatisfactory generalization and maintenance. We discuss the principles of generalization in relation to theories of motor learning in Developmental Phonological Disorders: Foundations of Clinical Practice. One important principle is that the child needs a well-established representation of the acoustic-phonetic target. All seven of the children in Byun’s study had poor auditory processing skills, but no part of the treatment program addressed phonological processing, phonological knowledge, or acoustic-phonetic representations. Second, it is essential to have the tools to monitor and use self-produced feedback (auditory, somatosensory) to evaluate success in achieving the target. Both the traditional and the biofeedback intervention put the child in the position of being dependent upon external feedback. The outcome measure focused attention on improvements from the beginning of the practice session to the end; the first principle of motor learning, however, is that practice performance is not an indication of learning. The focus should have been on the sometimes large decrements in probe scores from the end of one session to the beginning of the next. The children had no means of maintaining any of those performance gains. Acoustic feedback may be a powerful means of establishing a new response, but it is a counterproductive tool for maintenance and generalization learning.

Reading

McAllister Byun, T. (2017). Efficacy of Visual–Acoustic Biofeedback Intervention for Residual Rhotic Errors: A Single-Subject Randomization Study. Journal of Speech, Language, and Hearing Research, 60(5), 1175-1193. doi:10.1044/2016_JSLHR-S-16-0038

Rvachew, S., & Matthews, T. (2017). Demonstrating treatment efficacy using the single subject randomization design: A tutorial and demonstration. Journal of Communication Disorders, 67, 1-13. https://doi.org/10.1016/j.jcomdis.2017.04.003


Testing Client Response to Alternative Speech Therapies

Buchwald et al. published one of the many interesting papers in a recent special issue on motor speech disorders in the Journal of Speech, Language, and Hearing Research. In their paper they outline a common approach to speech production, one that is illustrated and discussed in some detail in Chapters 3 and 7 of our book, Developmental Phonological Disorders: Foundations of Clinical Practice; Buchwald et al., however, apply it in the context of acquired apraxia of speech. They distinguish between patients who produce speech errors subsequent to a left hemisphere cerebrovascular accident as a consequence of motor planning difficulties versus phonological planning difficulties. Specifically, their study includes four such patients, two in each subgroup. Acoustic analysis was used to determine whether their cluster errors arose during phonological planning or during the next stage of speech production, motor planning. The analysis involves comparing the durations of segments in triads of words like this: /skæmp/ → [skæmp], /skæmp/ → [skæm], /skæm/ → [skæm]. The basic idea is that if segments such as [k] in /sk/ → [k] or [m] in /mp/ → [m] are produced as they would be in a singleton context, then the errors arise during phonological planning; alternatively, if they are produced as they would be in the cluster context, then the deletion errors arise during motor planning. This leads the authors to hypothesize that patients with these different error types would respond differently to intervention. So they treated all four patients with the same treatment, described as “repetition based speech motor learning practice”. Consistent with their hypothesis, the two patients with motor planning errors responded to this treatment and the two with phonological planning errors did not, as shown in the table of pre- versus post-treatment results.

[Table: Buchwald et al. pre- versus post-treatment results (corrected)]
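To make the logic of the acoustic analysis concrete, here is a toy sketch in Python; the duration values and the simple nearest-match rule are my own illustration, not Buchwald et al.’s actual measurement procedure.

```python
# When a cluster segment is deleted (/skæmp/ -> [skæm]), does the surviving
# [m] have singleton-like or cluster-like duration?
def classify_deletion(surviving_ms, singleton_ms, cluster_ms):
    """Attribute the deletion to phonological planning if the surviving segment's
    duration is closer to its singleton value, or to motor planning if it is
    closer to its (typically compressed) within-cluster value."""
    if abs(surviving_ms - singleton_ms) <= abs(surviving_ms - cluster_ms):
        return "phonological planning"
    return "motor planning"

# Hypothetical values: [m] measured at 95 ms in the error production, versus a
# 100 ms singleton [m] in /skæm/ and a 70 ms cluster [m] in /skæmp/.
print(classify_deletion(95, singleton_ms=100, cluster_ms=70))
# -> phonological planning
```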

However, as the authors point out, a significant limitation of this study is that the design is not experimental. Having failed to establish experimental control either within or across speakers, it is difficult to draw conclusions.

I find the paper to be of interest on two accounts nonetheless. First, their hypothesis is exactly the same hypothesis that Tanya Matthews and I posed for children who present with phonological versus motor planning deficits. Second, their hypothesis is fully compatible with the application of a single subject randomization design. It therefore provides me with an opportunity to follow through on my promise from the previous blog to demonstrate how to set up this design for clinical research.

For her dissertation research, Tanya identified 11 children with severe speech disorders and inconsistent speech sound errors who completed our full experimental paradigm. These children were diagnosed with either a phonological planning disorder or a motor planning disorder using the Syllable Repetition Task and other assessments, as described in our recent CJSLPA paper, available open access here. Using those procedures, we found that 6 had a motor planning deficit and 5 had a phonological planning deficit.

Then we hypothesized that the children with motor planning disorders would respond to a treatment that targeted speech motor control: much like Buchwald et al., it included repetition practice according to the principles of motor learning during the practice part of each session, but during prepractice the children were taught to identify the target words and to identify mispronunciations of the target words so that they would be better able to integrate feedback and self-correct during repetition practice. Notice that direct and delayed imitation are important procedures in this approach. We called this the auditory-motor integration (AMI) approach.

For children with phonological planning disorders, we hypothesized that they would respond to a treatment based on principles similar to those suggested by Dodd et al. (i.e., see the core vocabulary approach). Specifically, the children were taught to segment the target words into phonemes, associating the phonemes with visual cues. Then we taught the children to chain the phonemes back together into a single word. Finally, during the practice component of each session, we encouraged the children to produce the words using the visual cues when necessary. An important component of this approach is that auditory-visual models are not provided prior to the child’s production attempt; the child is forced to construct the phonological plan independently. We called this the phonological memory and planning (PMP) approach.

We also had a control condition that consisted solely of repetition practice (the CON condition).

The big difference between our work and Buchwald et al. is that we tested our hypothesis using a single subject block randomization design, as described in our recent tutorial in the Journal of Communication Disorders. The design was set up so that each of the 11 children experienced all three treatments. We chose 3 treatment targets for each child, randomly assigned the targets to the three treatments, and then randomly assigned the treatments to the three sessions of each week, scheduled to occur on different days, 3 sessions per week for 6 weeks. You can see from the table below that each week counts as one block, so there are 6 blocks of 3 sessions, for 18 sessions in total. The randomization scheme was generated blindly and independently for each child using computer software. The diagram below shows the treatment schedule for one of the children with a motor planning disorder; a sketch of how such a scheme can be generated follows the diagram.

[Figure: block randomization schedule for TASC02]
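A randomization scheme like the one diagrammed above takes only a few lines to generate. This sketch (Python, with hypothetical target names) pairs each target with a treatment once and then randomly orders the three pairs across the sessions of each weekly block.

```python
# Block randomization: 3 treatment/target pairs randomly ordered within each of
# 6 weekly blocks, giving 18 sessions in total.
import random

random.seed(11)  # seeded only to make the illustration reproducible
treatments = ["AMI", "PMP", "CON"]
targets = ["target A", "target B", "target C"]  # hypothetical placeholders

random.shuffle(targets)                 # random target-to-treatment pairing
pairs = list(zip(treatments, targets))  # the pairing stays fixed for all 6 weeks

for week in range(1, 7):
    for session, (treatment, target) in enumerate(random.sample(pairs, k=3), 1):
        print(f"Week {week}, session {session}: {treatment} ({target})")
```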

This design allowed us to compare response to the three treatments within each child using a randomization test. For this child, the randomization test revealed a highly significant difference in favour of the AMI treatment as compared to the PMP treatment, as hypothesized for children with motor planning deficits. I don’t want to scoop Tanya’s thesis, because she will finish it soon, before the end of 2017 I’m sure, but the long and the short of it is that we have very clear results in favour of our hypothesis using this fully experimental design and the statistics that are licensed by it. I hope you will check out our tutorial on the application of this design: we show how flexible and versatile it can be for addressing many different questions about speech-language practice. There is much exciting work being done in the area of speech motor control, and this is a design that gives researchers and clinicians an opportunity to obtain interpretable results with small samples of children with rare or idiosyncratic profiles.

Reading

Buchwald, A., & Miozzo, M. (2012). Phonological and Motor Errors in Individuals With Acquired Sound Production Impairment. Journal of Speech, Language, and Hearing Research, 55(5), S1573-S1586. doi:10.1044/1092-4388(2012/11-0200)

Rvachew, S., & Matthews, T. (2017). Using the Syllable Repetition Task to Reveal Underlying Speech Processes in Childhood Apraxia of Speech: A Tutorial. Canadian Journal of Speech-Language Pathology and Audiology, 41(1), 106-126.

Rvachew, S., & Matthews, T. (2017). Demonstrating treatment efficacy using the single subject randomization design: A tutorial and demonstration. Journal of Communication Disorders, 67, 1-13. https://doi.org/10.1016/j.jcomdis.2017.04.003


Single Subject Randomization Design For Clinical Research

[Image: Ebbels tweet on intervention research]

During the week of April 23-29, 2017, Susan Ebbels curated @WeSpeechies on the topic Carrying Out Intervention Research in SLP/SLT Practice. Susan kicked off the week with a link to her excellent paper discussing the strengths and limitations of various procedures for conducting intervention research in the clinical setting. As we would expect, a parallel groups randomized control design was deemed to provide the best level of experimental control. Many ways of studying treatment-related change within individual clients, with increasing degrees of control, were also discussed. However, all of the ‘within participant’ methods described were vulnerable, to varying degrees, to confounding by threats to internal validity such as history, selection, practice, fatigue, maturation, or placebo effects.

One design was missing from the list because it is only just now appearing in the speech-language pathology literature: the single subject randomization design. The design (actually a group of designs in which treatment sessions are randomly allocated to treatment conditions) provides the superior internal validity of the parallel groups randomized control trial by controlling for extraneous confounds through randomization. As an added benefit, the results of a single subject randomization design can be submitted to statistical analysis, so that clear conclusions can be drawn about the efficacy of the experimental intervention. At the same time, the design can feasibly be implemented in the clinical setting and is perfect for answering the kinds of questions that come up in daily clinical practice. For example, randomized control trials have shown that, on average, speech perception training is an effective adjunct to speech articulation therapy when applied to groups of children, but you may want to know whether it is a necessary addition to your therapy program for a specific child.

Furthermore, randomized single subject experiments are now accepted as a high level of research evidence by the Oxford Centre for Evidence-Based Medicine. An evidence hierarchy has been created for rating single subject trials, putting randomized single subject experiments at the top, as shown in the following table, taken from Romeiser Logan et al. (2008).

[Table: Romeiser Logan et al. (2008) levels of evidence for single-case research]

Tanya Matthews and I have written a tutorial showing exactly how to implement and interpret two versions of the single subject randomization design: a phase design and an alternation design. The accepted manuscript is available, behind a paywall, at the Journal of Communication Disorders. In another post I will provide a mini-tutorial showing how the alternation design could be used to answer a clinical question about a single client.

Further Reading

Ebbels, S. H. (2017). Intervention research: Appraising study designs, interpreting findings and creating research in clinical practice. International Journal of Speech-Language Pathology, 1-14.

Kratochwill, T. R., & Levin, J. R. (2010). Enhancing the scientific credibility of single-case intervention research: Randomization to the rescue. Psychological Methods, 15, 124-144.

Romeiser Logan, L., Hickman, R. R., Harris, S. R., & Heriza, C. B. (2008). Single-subject research design: Recommendations for levels of evidence and quality rating. Developmental Medicine and Child Neurology, 50, 99-103.

Rvachew, S. (1988). Application of single subject randomization designs to communicative disorders research. Human Communication Canada (now Canadian Journal of Speech-Language Pathology and Audiology), 12, 7-13. [open access]

Rvachew, S. (1994). Speech perception training can facilitate sound production learning. Journal of Speech and Hearing Research, 37, 347-357.

Rvachew, S., & Matthews, T. (in press). Demonstrating treatment efficacy using the single subject randomization design: A tutorial and demonstration. Journal of Communication Disorders.


Single Subject Designs and Evidence Based Practice in Speech Therapy

I was really happy to see the tutorial on single subject experimental designs by Byiers, Reichle, and Symons in the November issue of the American Journal of Speech-Language Pathology. The paper does not really present anything new, since it covers ground previously published by authors such as Kearns (1986). However, with the current focus on RCTs as the be-all and end-all of evidence based practice, it was a timely reminder that single subject designs have a lot to offer for EBP in speech therapy. It really irritates me when I see profs tell their students that speech therapy practice does not have an evidentiary base: many of our standard practices are well grounded in good quality single subject research (not to mention some rather nice RCTs from the sixties, but that is another story, maybe for another post).

Byiers et al. do a nice job of outlining the primary features of a valid single subject experiment. The internal validity of the standard designs is completely dependent upon a stable baseline, with no improving trend in the data prior to the introduction of the treatment. They indicate that “by convention, a minimum of three baseline data points are required to establish dependent measure stability.” Furthermore, it is essential not to see carry-over effects from treatment of one target to a second target prior to the introduction of treatment for the second target; in other words, performance on any given target must remain stable until treatment for that specific target is introduced. The internal validity of the experiment is voided when stable baselines for each target are not established and maintained throughout their respective baseline periods. This is true even for the multiple probe design, a variation on the multiple baseline design in which the dependent measure is sampled at irregular intervals tied to the introduction of successive phases of the treatment program (as opposed to the regular and repeated measurement that occurs during each and every session of a multiple baseline design). Even with the multiple probe design, a series of closely spaced baseline probes is required at certain intervals to demonstrate stability of baselines just before a new treatment phase begins. Furthermore, the design is an inappropriate choice unless a “strong a priori assumption of stability can be made” (see Horner and Baer, 1978).

I am interested in the multiple probe design because it is the preferred design of the research teams that claim that the “complexity approach” to target selection in phonology interventions is effective and efficient. However, it is clear that the design is not appropriate in this context (in fact, given the research question, I would argue that all single subject designs are inappropriate in this context). The reasoning behind the complexity approach is that treating complex targets results in generalization of learning to less complex targets. This is supposed to be more efficient than treating the less complex targets first, because those targets are expected to improve spontaneously without treatment (e.g., as a result of maturation) while not resulting in generalization to more complex targets. The problem, of course, is that improvement in less complex targets while you are treating a more complex one (especially when you get no improvement on the treatment target; see Cummings and Barlow, 2011) cannot be interpreted as a treatment effect. By the logic of a single subject experiment, this outcome indicates that you do not have experimental control. To make matters worse, these improvements in generalization targets are often observed prior to the introduction of treatment, and indeed the a priori assumption is that these improvements in less complex targets will occur without treatment; that is the whole rationale behind avoiding them as treatment targets! Therefore, by definition, both the multiple baseline and multiple probe designs are invalid approaches to testing the complexity hypothesis. Without a randomized control trial, one can only conclude that the changes observed in less complex targets in these studies are the result of maturation or history effects. (If you want to see what happens when you test the efficacy of the complexity approach using a randomized control trial, check out my publications: Rvachew & Nowak, 2001; Rvachew & Nowak, 2003; Rvachew, 2005; Rvachew & Bernhardt, 2010.)

Some recent single subject studies have had some really nice outcomes for some children. Ballard, Robin, and McCabe (2010) demonstrated an effective treatment for improving prosody in children with apraxia of speech, showing that work on pseudoword targets generalizes to real word dependent measures. Skelton (2004) showed that you can literally randomize your task sequence and get excellent results for the treatment of /s/, with carryover to the nonclinic environment; in other words, you don’t have to follow the usual isolation-syllable-word-phrase-sentence sequence, but can instead mix it up by practicing items at a randomly selected difficulty level on every trial (see the sketch below). Both of these studies showed uneven outcomes for different children, however. Francoise and I suggested at ASHA 2012 that the “challenge point framework” helps to explain variability in outcomes across children. The trick is to teach targets that are at the challenge point for the child: not uniformly complex, but carefully selected to be neither too simple nor too complex for each individual child.
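As a sketch of what randomizing the task sequence might look like in practice (my illustration of the general idea, not Skelton’s actual protocol):

```python
# Each practice trial draws its difficulty level at random instead of following
# the fixed isolation -> syllable -> word -> phrase -> sentence progression.
import random

levels = ["isolation", "syllable", "word", "phrase", "sentence"]
session_plan = [random.choice(levels) for _ in range(100)]  # a 100-trial session
print(session_plan[:10])
```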

Both of these studies (Ballard et al. and the Skelton study) used a multiple baseline design. This design tends to encourage the selection of complex targets, because consistent 0% correct is as stable as a baseline can get. If you want to pick targets that are at the “challenge point”, you may be working on targets for which the child is demonstrating less stable performance. Fortunately, there is a single subject design that does not require a stable baseline for internal validity: the single subject randomization design. We are using two different variations on this design in our current study of different treatments for childhood apraxia of speech. I will describe our application of the design in another post.