What is a control group?

I have a feeling that my blog might become less popular in the next little while because you may notice an emerging theme: a shift toward research design and away from speech therapy procedures specifically! But it is important to know how to identify evidence-based procedures, and doing that requires knowledge of research design. It has also come to my attention, while publishing two randomized control trials (RCTs) this past year, that there are a lot of misperceptions about what an RCT is in the SLP and education communities, among both clinicians and researchers. Therefore, I am happy to draw your attention to this terrific blog by Edzard Ernst, and in particular to an especially useful post, “How to differentiate good from bad research”. The writer points out that a proper treatment of this topic “must inevitably have the size of a book” because each of the indicators that he provides “is far too short to make real sense.” So I have taken it upon myself in this blog to expand upon one of his indicators of good research – one that I know causes some confusion, specifically:

  • Use of a placebo in the control group where possible.

Recently the reviewers (and editor) of one of my studies were convinced that my design was not an RCT because the children in both groups received an intervention. In the absence of a “no-treatment control,” they said, the study could not be an RCT! I was mystified about the source of this strange idea until I read Ernst’s blog and realized that many people, recalling their research courses from university, must be mistaking “placebo control” for “no-treatment control.” However, a placebo control condition is not at all like the absence of treatment. Consider the classic example of a placebo control: in a drug trial, each patient randomized to the treatment arm will visit the nurse, who hands him or her a white paper cup holding 2 pink pills containing active ingredient X and some other ingredients that do not impact the patient’s disease, i.e., inactive ingredients; each patient randomized to the control arm will also visit the nurse, who hands him or her a white paper cup holding 2 pink pills containing only the inactive ingredients. In other words, the experiment is designed so that all patients are “treated” exactly the same except that only patients randomized to treatment receive (unknowingly) the active ingredient. Therefore, all changes in patient behavior that are due to those aspects of the treatment that are not the active treatment (visiting the nice nurse, expecting the pills to make a difference etc.) are equalized across arms of the study. These are called the “common factors” or “nonspecific factors”.

In the case of a behavioral treatment it is important to equalize the common factors across all arms of the study. Therefore, in my own studies I deliberately avoid “no treatment” controls. In my very first RCT (Rvachew, 1994), for example, the treatment conditions in the two arms of the study were as follows:

  • Experimental: 10 minutes of listening to sheet vs Xsheet recordings and judging correct vs incorrect “sheet” items (active ingredient) in a computer game format followed by 20 minutes of traditional “sh” articulation therapy, provided by a person blind to the computer game target.
  • Control: 10 minutes of listening to Pete vs meat recordings and judging correct vs incorrect “Pete” items in a computer game format followed by 20 minutes of traditional “sh” articulation therapy, provided by a person blind to the computer game target.

It can be seen that the study was designed to ensure that all participants experienced exactly the same treatment except for the active ingredient that was reserved for children who were randomly assigned to the experimental treatment arm, specifically exposure to the experience of listening to and making perceptual judgments about a variety of correct and incorrect versions of words beginning with “sh” or distorted versions of “sh”, the sound that the children misarticulated. Subsequently I have conducted all my randomized control studies in a similar manner. But, as I said earlier, I run across readers who vociferously assert that the studies are not RCTs because an RCT requires a “no treatment” control. In fact, a “no treatment” control is a very poor control indeed, as argued in this blog that explains why the frequently used “wait list control group” is inappropriate. For example, a recent trial on the treatment of tinnitus claimed that a wait list control had merit because “While this comparison condition does not control for all potential placebo effects (e.g., positive expectation, therapeutic contact, the desire to please therapists), the wait-list control does account for the natural passing of time and spontaneous remission.” In fact, it is impossible to control for common factors when using a wait list control and it is unlikely that patients are actually “just waiting” when you randomize them to the “wait list control” condition; therefore Hesser et al.’s defense of the wait list control is optimistic, although their effort to establish how much change you get in this condition is worthwhile.
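The difference between the two kinds of control can be made concrete with a small arithmetic sketch. The effect sizes below are invented purely for illustration (they come from no study), and the sketch assumes that maturation, common factors and the active ingredient simply add together. It also idealizes the wait list arm as receiving no common factors at all, which real wait list participants rarely do.

```python
# Hypothetical effect sizes in arbitrary outcome units (illustration only).
MATURATION = 2.0         # change from the passage of time alone
COMMON_FACTORS = 3.0     # expectation, therapeutic contact, attention
ACTIVE_INGREDIENT = 1.5  # specific effect of the procedure under test


def expected_change(gets_common_factors: bool, gets_active_ingredient: bool) -> float:
    """Expected change in one study arm under a simple additive assumption."""
    change = MATURATION  # every arm matures, treated or not
    if gets_common_factors:
        change += COMMON_FACTORS
    if gets_active_ingredient:
        change += ACTIVE_INGREDIENT
    return change


treatment_arm = expected_change(True, True)
placebo_control_arm = expected_change(True, False)
wait_list_arm = expected_change(False, False)

# Against a placebo control, the group difference isolates the active ingredient.
effect_vs_placebo = treatment_arm - placebo_control_arm

# Against a wait list, the difference mixes common factors with the active ingredient.
effect_vs_wait_list = treatment_arm - wait_list_arm

print("Difference vs placebo control:", effect_vs_placebo)
print("Difference vs wait list:", effect_vs_wait_list)
```

Under these assumptions only the placebo comparison recovers the active ingredient on its own; the wait list comparison cannot separate it from the common factors.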

We had experience with a “wait list” comparison condition in a recent trial (Rvachew & Brosseau-Lapré, 2015). Most of the children were randomly assigned to one of four different treatment conditions, matched on all factors except the specific active ingredients of interest. However, we also had a nonexperimental wait list comparison group* to estimate change for children outside of the trial. We found that parents were savvy about maximizing the treatment that their children could receive in any given year. Our trial lasted six weeks, the public health system entitled them to six weeks of treatment and their private insurance entitled them to six to 12 weeks of therapy depending on the plan. Parents would agree to enroll their child in the trial with randomization to a treatment arm if their child was waiting for the public service, OR they would agree to be assessed in the “wait list” arm if their child was currently enrolled in the public service. They would use their private insurance when all other options had been exhausted. Therefore the children in the “wait list” arm were actually being treated. Interestingly, we found that the parents expected their children to obtain better results from the public service because it was provided by a “real” SLP rather than the student SLPs who provided our experimental treatments, even though the public service was considerably less intense! (As an aside, we were not surprised to find that the reverse was true.) Similarly, as I have mentioned in previous blogs, Yoder et al. (2005) found that the children in their “no treatment” control accessed more treatment from other sources than did the children in their treatment arm. And parents randomized to the “watchful waiting” arm of the Glogowska et al. (2000) trial sometimes dropped out because parents will do what they must to meet their child’s needs.

In closing, a randomized control trial is simply a study in which participants are randomly assigned to an experimental treatment and a control condition (even in a cross-over design, in which all participants experience all conditions, as in Rvachew et al., in press). The nature of the control should be determined after careful thought about the factors that you are attempting to control, which can be many – placebo, Hawthorne, fatigue, practice, history, maturation and so on. These will obviously vary from trial to trial. Placebo control does not mean “no treatment” but rather a treatment that includes everything except the “active ingredient” that is the subject of your trial. As an SLP, when you are reading about studies that test the efficacy of a treatment, you need to pay attention to what happens to the control group as well as the treatment group. The trick is to think in every case: What is the active ingredient that explains the effect seen in the treatment group? What else might account for the effects seen in the treatment arm of this study? If I implement this treatment in my own practice, how likely am I to get a better result compared to the treatment that my caseload is currently receiving?

* A colleague sent me a paper (Mercer et al., 2007) in which a large number of researchers, advocating for the acceptance of a broader array of research designs in order to focus more attention on external validity and translational research, got together to discuss the merits of various designs. During the symposium it arose that there was disagreement about the use of the terms “control” and “comparison” group. I use the terms in accordance with a minority of their attendees, as follows: control group means that the participants were randomly assigned to a group that did not experience the “active ingredient” of the experimental treatment; comparison group means that the participants were not randomly assigned to the group that did not experience the experimental intervention, a group that may or may not have received a treatment. This definition was ultimately not used by the attendees, and I don’t know why. Somehow they decided on a different definition that didn’t make any sense at all. I invite you to consult p. 141 and see if you can figure it out!


Glogowska, M., Roulstone, S., Enderby, P., & Peters, T. (2000). Randomised controlled trial of community based speech and language therapy in preschool children. British Medical Journal, 321, 923-928.

Hesser, H., Weise, C., Rief, W., & Andersson, G. (2011). The effect of waiting: A meta-analysis of wait-list control groups in trials for tinnitus distress. Journal of Psychosomatic Research, 70(4), 378-384. doi:10.1016/j.jpsychores.2010.12.006

Mercer, S. L., DeVinney, B. J., Fine, L. J., Green, L. W., & Dougherty, D. (2007). Study Designs for Effectiveness and Translation Research: Identifying Trade-offs. American Journal of Preventive Medicine, 33(2), 139-154.e132. doi:10.1016/j.amepre.2007.04.005

Rvachew, S. (1994). Speech perception training can facilitate sound production learning. Journal of Speech and Hearing Research, 37, 347-357.

Rvachew, S., & Brosseau-Lapré, F. (2015). A randomized trial of twelve week interventions for the treatment of developmental phonological disorder in francophone children. American Journal of Speech-Language Pathology, 24, 637-658. doi:10.1044/2015_AJSLP-14-0056

Rvachew, S., Rees, K., Carolan, E., & Nadig, A. (in press). Improving emergent literacy with school-based shared reading: Paper versus ebooks. International Journal of Child-Computer Interaction. doi:10.1016/j.ijcci.2017.01.002

Yoder, P. J., Camarata, S., & Gardner, E. (2005). Treatment effects on speech intelligibility and length of utterance in children with specific language and intelligibility impairments. Journal of Early Intervention, 28(1), 34-49.

Thinking About ‘Dose’ and SLP Practice: Part II

I have been talking about whether it is helpful to think about dose-response relationships as an important aspect of treatment efficacy. During a recent @wespeechies exchange, we discussed whether this “medical” concept should be applied to speech therapy. One objection raised was the idea that treatment efficacy is “all about relationships” and therefore the dosage of specific inputs was not all that relevant to outcomes. In psychotherapy, objections to manualized care protocols that prescribe specific procedures for defined cases are also based on the notion that treatment efficacy is determined not by the specific ingredients of the treatment program but rather by common factors, as I discussed in a previous blog. One of the important common factors is the therapeutic alliance. How important is the therapeutic alliance to treatment outcomes? And does attention to the therapeutic alliance preclude thinking carefully about which procedures to use in which amounts with a given case?

In psychotherapy the therapeutic alliance is defined “as agreement on the goals and tasks of therapy in the context of a positive affective bond between patient and therapist.” Even when working with children, this can be an important aspect of the treatment program. For example, McCormack, McLeod, McAllister and Harrison describe children’s experience of speech impairment in a paper entitled “My Speech Problem, Your Listening Problem, My Frustration…”. This qualitative study illuminates multiple facets of an SSD and further shows that the child’s perspective and the adult’s perspective on the problem and the solution are often not aligned. Shifting the child’s attention to the role of his or her speech problem in communication breakdowns will require a genuine, caring, sensitive and trusting relationship between SLP and child. Establishing common goals and motivating the child to try new tasks to achieve those goals will also be highly dependent upon the therapeutic alliance between child and therapist.

To understand how the therapeutic alliance impacts on therapy outcomes we must return to the psychotherapy literature because I am aware of no scientific studies in the speech therapy arena that have addressed this issue directly. In mental health services, the strength of the therapeutic alliance is measured by asking clients questions about their relationship with their therapist in three domains, specifically goals (e.g., We agree on what is important for me to work on.), tasks (e.g., I agree the way we are working on my problem is correct.), and bond (e.g., I believe my therapist likes me.). Very large sample studies have shown that the relationship between therapist and client accounts for about 20% of variance in outcomes. However, the relationship between outcomes and the therapeutic relationship is reciprocal: if the client gets better, they have more trust in the therapist’s guidance regarding goals and tasks. Therefore, the therapeutic relationship is theoretically independent of the techniques and procedures that the therapist uses, but in practice these variables may be related.

To put this in the speech therapy context again, Francoise Brosseau-Lapré and I are in the process of publishing the results of our RCT, Essai Clinique sur les Interventions Phonologique. We found that an input oriented approach (procedures focused on perceptual and phonological knowledge with very little articulatory practice) was as effective as an output oriented approach (all procedures focused on articulation practice) for improving children’s articulation accuracy. Therefore, when working with a very shy child who does not like to imitate or, indeed, talk at all during speech therapy, you and the parent and the child might all agree that the input oriented approach is the ideal way to work on the child’s speech problem. Initially the therapeutic alliance might be high, but what if the implementation of the approach is not competent? We find, for example, that it is actually quite difficult to teach students to implement the procedures (focused stimulation, error detection tasks and meaningful minimal pairs procedures) correctly. Furthermore, when procedures were mixed and matched in a way that is not theoretically coherent (for example, input oriented procedures in the clinic but an output oriented home practice program), we observed very poor outcomes. It is probable that in cases of poor implementation, outcomes and the therapeutic alliance will both suffer. At the very least, as I have found previously, parents are able to identify poor speech outcomes in their children even as they report good relationships with their child’s SLP.

This discussion reminds me of a very interesting article about teacher effectiveness that was circulated on Twitter by @KevinWheldell. Gregory Yates makes the distinction between good teachers and effective teachers. Similarly, SLPs may be readily judged to be good on the basis of personal and moral qualities such as warmth, caring, friendliness and conscientiousness, all of which contribute to positive relationships with clients, coworkers and their institution. Effectiveness, however, requires the skillful application of specific techniques and procedures in relation to client needs and can only be measured in reference to client outcomes. More about this in the next blogpost in this series.

Don’t get tricked: Why it pays to read original sources.

In my last blog post I suggested that you can have confidence in the effectiveness of your clinical practice if you select treatment practices that have been validated by research. Furthermore, I provided links to some resources for summaries of research evidence. In this blog post I want to caution that it is important to read the original sources and to view the summaries, including meta-analyses, with some skepticism. Excellent clinical practice requires a deep knowledge of the basic science that is the foundation for the clinical procedures that you are using. Familiarity with the details of the clinical studies that address the efficacy of those procedures is also essential. I will provide two examples where a lack of familiarity with those details has led to some perverse outcomes.

Two decades ago it was quite common for children who were receiving services from publicly funded providers in Canada to receive 16-week blocks of intervention. Then we went through the recession of the nineties and there was much pressure on managers in health care to cut costs. Fey, Cleave, Long, and Hughes (1993) conveniently published an RCT demonstrating that a parent intervention was just as effective as direct intervention provided by the SLP to improve children’s expressive grammar – the icing on the cake was that the parent-provided service required half as many SLP hours as the direct SLP-provided service. All across Canada, direct service blocks were cut to 8 weeks and parent-consultation services were substituted for the direct therapy model. About a decade after that I made a little money myself giving workshops to SLPs on evidence-based practice. The audiences were always shocked when I presented the actual resource inputs for Fey et al.’s interventions: (1) direct SLP intervention – cost = 40 hours per child over 20 weeks, versus (2) parent-administered intervention – cost = 21 hours per child over 20 weeks. So you see, the SLPs had been had by their managers! The SLPs would have been better positioned to resist this harmful change in service delivery model if they had been aware of the source of the claim that you could halve your therapy time by implementing a home program and get the same result. I don’t know that our profession could have changed the situation by being more knowledgeable about the research on service delivery models because the political and financial pressures at the time were extreme – but at least we and our patients would have had a fighting chance!
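The arithmetic behind that shock is worth spelling out. This short sketch uses only the figures quoted above (40 and 21 SLP hours, each over 20 weeks); the variable names are mine, and the number of SLP hours in an 8-week block is not stated here, so it is left out.

```python
# Resource inputs as quoted above from Fey et al. (1993).
WEEKS = 20
DIRECT_SLP_HOURS = 40  # direct SLP intervention, per child
PARENT_SLP_HOURS = 21  # parent-administered intervention, per child

# The parent program really did need about half the SLP hours...
hours_ratio = PARENT_SLP_HOURS / DIRECT_SLP_HOURS

# ...but those hours were still spread over the full 20 weeks.
direct_hours_per_week = DIRECT_SLP_HOURS / WEEKS
parent_hours_per_week = PARENT_SLP_HOURS / WEEKS

print(f"Parent program used {hours_ratio:.1%} of the direct-service SLP hours")
print(f"Direct service: {direct_hours_per_week:.2f} SLP hours per week for {WEEKS} weeks")
print(f"Parent program: {parent_hours_per_week:.2f} SLP hours per week for {WEEKS} weeks")
```

The saving in the study was in SLP hours per week, not in the length of the intervention, so an 8-week block is not the condition that was shown to be equally effective.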

Another reason that you have to be vigilant is that the authors of research summaries have been known to engage in some sleight of hand. An example of this is the chapter on Complexity Approaches by Baker and Williams in the book Interventions for Speech Sound Disorders in Children. This book is pretty cool because each chapter describes a different approach and is usually accompanied by a video demonstration. Each author was asked to identify all the studies that support the approach and put them on a “levels of evidence” table. As indicated in a previous blog post, the complexity approach to selecting targets for intervention is supposedly supported by a great many studies employing the multiple probe design, which is a fairly low level of evidence because it does not control for maturation or history effects. In the Baker and Williams “levels of evidence” table all of these single subject studies are listed, so it looks pretty impressive. The evidence to support the approach looks even more impressive when you notice that two randomized controlled trials are shown at a higher level on the table. This table leads you to believe that the complexity approach is supported by a large amount of data and the highest level of evidence until you realize that neither of those two RCTs, Dodd et al. (2008) and Rvachew and Nowak (2001), supports the complexity approach. Even when you read the text, it is not clear that these RCTs do not provide support for the approach because the authors are a bit waffly about this fact. Before I noticed this table I couldn’t understand why clinicians would tell me proudly that they were using the complexity approach because it is evidence based. It is pretty hard to keep up with the evidence when you have to watch out for tricks like this!

In the comments to my last blog post there were questions about how you can be sure that your treatment is leading to change that is better than maturation alone. An RCT is designed to answer just that question so I am going to discuss the results of Rvachew and Nowak (2001), as detailed in a later paper, Rvachew, S. (2005). Stimulability and treatment success. Topics in Language Disorders. Clinical Perspectives on Speech Sound Disorders, 25(3), 207-219. Unfortunately this paper is hard to get so a lot of SLPs are not aware of the implications of our findings for the central argument that motivates the use of the complexity approach to target selection.  Gierut (2007) grounds the complexity approach on learnability theory, paradoxically the notion that language is essentially unlearnable and thus the structure of language must be innately built in. Complex language inputs are necessary to trigger access to this knowledge. Because of the hierarchical structure of this built-in knowledge, exposure to complex structure will “unlock the whole”, having a cascading effect down through the system. On the other hand, she claims that “it has been shown that simpler input actually makes language learning more difficult because the child is provided with only partial information about linguistic structure (p. 8).”

We tested this hypothesis in our RCT. Each child received a 15 item probe of their ability to produce all the consonants of English in initial, medial and final position of words. The phonemes that they had not mastered were then ordered according to productive phonological knowledge and developmental order. Michele Nowak selected potential treatment targets for each child from both ends of the continuum. I independently (blindly, without access to the child’s test information or knowledge of the targets that Michele had selected) randomly assigned the child to treatment condition, either ME or LL. ME condition means that the child was treated for phonemes for which the child had most knowledge and which are usually early developing. LL condition means that the child was treated for phonemes for which the child had least productive phonological knowledge and which are usually late developing. The children were treated in two six week blocks with a change in treatment targets for the second block using the same procedure to select the targets. The figure below shows probe performance for several actual and potential targets per child: the phoneme being treated in a given block, the phoneme to be treated in the next block (or that was treated in the previous block) and the phonemes that would have been treated if the child had been assigned to the other treatment condition. As a clinician, I am interested in learning and retention of the treated phonemes, relative to maturation. As a scientist who is testing the complexity approach, Gierut is interested in cross-class generalization, regardless of whether the child learns the targeted phoneme. We can look at these two outcomes across the two groups.
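The key feature of the assignment step described above is that the person randomizing sees nothing but an identifier. A minimal sketch of that idea follows; the function name, the child IDs, the seed and the use of simple (unblocked) randomization are all my assumptions for illustration, not a record of the procedure actually used in the trial.

```python
import random


def assign_conditions(child_ids, seed=None):
    """Randomly assign each child to the ME or LL condition.

    The function receives only identifiers, never test scores or
    candidate targets, so the allocation cannot be influenced by them.
    """
    rng = random.Random(seed)  # a seeded generator makes the allocation reproducible
    return {child_id: rng.choice(("ME", "LL")) for child_id in child_ids}


# Hypothetical identifiers for illustration.
children = ["child_01", "child_02", "child_03", "child_04", "child_05", "child_06"]
allocation = assign_conditions(children, seed=2001)

for child_id, condition in allocation.items():
    print(child_id, "->", condition)
```

Simple randomization like this can produce unequal group sizes in a small sample; a real trial might use blocked randomization instead, which this sketch does not attempt.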

Let’s begin with the question of whether the children learned the target phonemes and whether there is any evidence that this learning is greater than what we would see with maturation alone. In the chart, learning during treatment is shown by the solid lines whereas dotted lines indicate periods where those sounds were not being treated. A1 is the assessment before the first treatment block, A2 is the assessment after the first block and before the second block, and A3 is the last assessment after the second treatment block. On the left hand side, we see that the ME group was treated during the first block for phonemes that were mastered in one word position but not in the other two (average score of 6/15 prior to treatment). The slopes of the solid versus dotted lines show you that change from A1 to A2 was greater than change from A2 to A3. This means that these targets showed more change when they were being treated in the first block than when they were not being treated during the second block. During the second block, we treated slightly harder sounds that were not mastered in any word position, with a starting probe score of 3/15 on average. These phonemes improved from A1 to A2 even though they weren’t being treated but the rate of improvement is much higher between A2 and A3 when they were being treated. Interestingly, the slopes of the solid lines and the slopes of the dotted lines are parallel – this is your treatment effect – this is the proof that treatment is more effective than not treating. As further proof we can look at the results for the LL group. We have a similar situation with parallel solid and dotted lines for the phonemes that were treated in the first and second blocks at the bottom of the chart. We don’t have as much improvement for these phonemes because they were very difficult, unstimulable late developing sounds (targets that are consistent with the complexity approach). 
Nonetheless, the outcomes are better while the phonemes are being treated than when they are not (in fact there are slight regressions during the blocks when these sounds are not treated). At the same time, the phonemes for which the children have the most knowledge improve spontaneously (Gierut would attribute this change to cross-class generalization whereas I attribute this change to maturation). The interesting comparison, however, is across groups. Notice that the ME group shows a change of 4 points for treated “most knowledge” phonemes versus a change of 3 points for the untreated “most knowledge” phonemes for the LL group. This is not a very big difference but, nonetheless, treating these phonemes results in slightly faster progress than not treating them.

In our 2001 paper we reported that progress for treated targets was substantially better for children in the ME condition than for children in the LL condition (in the latter group, the children remained unstimulable for 45% of targets after 6 weeks of therapy). However, the proponents of the complexity approach are not interested in this finding. If the child does not learn the hard target that is an acceptable price to pay if cross-class generalization occurs and the child learns easier untreated phonemes. If you look at the right hand side of the chart by itself, the chart can be taken as support for the complexity approach because spontaneous gains are observed for the “most knowledge” phonemes. The problem is that the proponents of this approach have argued that exposure to “simpler input actually makes language learning more difficult” – it is literally supposed to be impossible to facilitate learning of harder targets by teaching simpler targets. Therefore the real test of the complexity approach is not in the right hand chart. We have to compare the rate of change for the unstimulable targets across the two groups. It is apparent that the gain for UNTREATED unstimulable phonemes (ME group, gain = 2) is double that observed for TREATED unstimulable phonemes (LL group, gain = 1). The results shown on the left clearly show that treating the easier sounds first facilitated improvements for the difficult phonemes. I have explained this outcome by reference to dynamic systems theory in Rvachew and Bernhardt (2010). From my perspective, it is not just that my RCT shows that the complexity approach doesn’t work. It’s that my RCT is just part of a growing and broad based literature that invalidates the “learnability approach” altogether. Francoise and I describe and evaluate this evidence while promoting a developmental approach to phonology in our book Developmental Phonological Disorders: Foundations of Clinical Practice.
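The across-group comparison can be laid out as a small table. The gains below are the probe-score changes quoted in the text (4 versus 3 for the “most knowledge” phonemes, 2 versus 1 for the unstimulable phonemes); the dictionary layout and key names are just one convenient way to arrange them.

```python
# Probe-score gains quoted in the text, keyed by (group, phoneme type).
gains = {
    ("ME", "most_knowledge"): {"treated": True, "gain": 4},
    ("LL", "most_knowledge"): {"treated": False, "gain": 3},
    ("ME", "unstimulable"): {"treated": False, "gain": 2},
    ("LL", "unstimulable"): {"treated": True, "gain": 1},
}

# Compare the same phoneme type across the two groups.
me_minus_ll = {}
for phoneme_type in ("most_knowledge", "unstimulable"):
    me = gains[("ME", phoneme_type)]
    ll = gains[("LL", phoneme_type)]
    me_minus_ll[phoneme_type] = me["gain"] - ll["gain"]
    print(
        f"{phoneme_type}: ME gain {me['gain']} (treated={me['treated']}) "
        f"vs LL gain {ll['gain']} (treated={ll['treated']})"
    )

# The comparison that matters for the complexity approach: unstimulable
# phonemes gained more when left untreated in the ME group than when
# treated directly in the LL group.
unstimulable_ratio = gains[("ME", "unstimulable")]["gain"] / gains[("LL", "unstimulable")]["gain"]
print("Untreated (ME) to treated (LL) gain ratio for unstimulable phonemes:", unstimulable_ratio)
```

Laid out this way, the ME group comes out ahead on both phoneme types, including the hard targets that the ME children were never directly taught.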


[Figure: Probe Scores for Treated and Untreated Phonemes]

The larger point that I am trying to make here is that SLPs need to know the literature deeply. The evidence summaries tend to take a bit of a “horse race” approach, grading study quality on the basis of sometimes questionable checklists and then making conclusions on the basis of how many studies can be amassed at a given level of the evidence table. This is not always a clinically useful practice. It is necessary to understand the underlying theory, to know the details of the methods used in those studies, and to draw your own conclusions about the applicability of the treatments to your own patients. This means reading the original sources. In order to achieve this level of knowledge we need to reorganize our profession to encourage a greater number of specialists in the field because no individual SLP can have this depth of knowledge about every type of patient that he or she might treat. But it should be possible to encourage the development of specialists who are given the opportunity to stay current with the literature and provide consultation services to generalists on the front lines. Even if we could ensure that SLPs had access to the best evidence as a guide to practice, however, there are some “common factors” that have a large impact on outcomes even when treatment approach is controlled. In my next post I will address the role of the individual clinician in ensuring excellent client outcomes.