What is a control group?

I have a feeling that my blog might become less popular in the next little while because you may notice an emerging theme on research design and away from speech therapy procedures specifically! But identifying evidence-based procedures requires knowledge of research design, and it has come to my attention, as part of the process of publishing two randomized controlled trials (RCTs) this past year, that there are a lot of misperceptions in the SLP and education communities, among both clinicians and researchers, about what an RCT is. Therefore, I am happy to draw your attention to the terrific blog by Edzard Ernst, and in particular to an especially useful post, “How to differentiate good from bad research”. The writer points out that a proper treatment of this topic “must inevitably have the size of a book” because each of the indicators that he provides “is far too short to make real sense.” So I have taken it upon myself in this post to expand upon one of his indicators of good research, one that I know causes some confusion, specifically:

  • Use of a placebo in the control group where possible.

Recently the reviewers (and editor) of one of my studies were convinced that my design was not an RCT because the children in both groups received an intervention. In the absence of a “no-treatment control,” they said, the study could not be an RCT! I was mystified about the source of this strange idea until I read Ernst’s blog and realized that many people, recalling their research courses from university, must be mistaking “placebo control” for “no-treatment control.” However, a placebo control condition is not at all like the absence of treatment. Consider the classic example of a placebo control in a drug trial: each patient randomized to the treatment arm visits the nurse, who hands him or her a white paper cup holding 2 pink pills containing active ingredient X along with other, inactive ingredients that do not impact the patient’s disease; each patient randomized to the control arm also visits the nurse, who hands him or her a white paper cup holding 2 pink pills containing only the inactive ingredients. In other words, the experiment is designed so that all patients are “treated” exactly the same except that only patients randomized to treatment receive (unknowingly) the active ingredient. Therefore, all changes in patient behavior that are due to aspects of the treatment other than the active ingredient (visiting the nice nurse, expecting the pills to make a difference, etc.) are equalized across arms of the study. These are called the “common factors” or “nonspecific factors”.

In the case of a behavioral treatment it is equally important to equalize the common factors across all arms of the study. Therefore, in my own studies I deliberately avoid “no treatment” controls. In my very first RCT (Rvachew, 1994), for example, the treatment conditions in the two arms of the study were as follows:

  • Experimental: 10 minutes of listening to sheet vs Xsheet recordings and judging correct vs incorrect “sheet” items (active ingredient) in a computer game format followed by 20 minutes of traditional “sh” articulation therapy, provided by a person blind to the computer game target.
  • Control: 10 minutes of listening to Pete vs meat recordings and judging correct vs incorrect “Pete” items in a computer game format followed by 20 minutes of traditional “sh” articulation therapy, provided by a person blind to the computer game target.

It can be seen that the study was designed to ensure that all participants experienced exactly the same treatment except for the active ingredient, which was reserved for children randomly assigned to the experimental treatment arm: exposure to the experience of listening to and making perceptual judgments about a variety of correct and incorrect versions of words beginning with “sh,” including distorted versions of “sh,” the sound that the children misarticulated. Subsequently I have conducted all my randomized controlled studies in a similar manner. But, as I said earlier, I run across readers who vociferously assert that the studies are not RCTs because an RCT requires a “no treatment” control. In fact, a “no treatment” control is a very poor control indeed, as argued in this blog post that explains why the frequently used “wait list control group” is inappropriate. For example, a recent trial on the treatment of tinnitus claimed that a wait list control had merit because “While this comparison condition does not control for all potential placebo effects (e.g., positive expectation, therapeutic contact, the desire to please therapists), the wait-list control does account for the natural passing of time and spontaneous remission.” In fact, it is impossible to control for common factors when using a wait list control, and it is unlikely that patients are actually “just waiting” when you randomize them to the “wait list control” condition; therefore Hesser et al.’s defense of the wait list control is optimistic, although their effort to establish how much change you get in this condition is worthwhile.

We had experience with a “wait list” comparison condition in a recent trial (Rvachew & Brosseau-Lapré, 2015). Most of the children were randomly assigned to one of four different treatment conditions, matched on all factors except the specific active ingredients of interest. However, we also had a nonexperimental wait list comparison group* to estimate change for children outside of the trial. We found that parents were savvy about maximizing the treatment that their children could receive in any given year. Our trial lasted six weeks; the public health system entitled them to six weeks of treatment, and their private insurance entitled them to six to twelve weeks of therapy, depending on the plan. Parents would agree to enroll their child in the trial, with randomization to a treatment arm, if their child was waiting for the public service, OR they would agree to be assessed in the “wait list” arm if their child was currently enrolled in the public service. They would use their private insurance when all other options had been exhausted. Therefore the children in the “wait list” arm were actually being treated. Interestingly, we found that the parents expected their children to obtain better results from the public service, because it was provided by a “real” SLP rather than the student SLPs who provided our experimental treatments, even though the public service was considerably less intense! (As an aside, we were not surprised to find that the reverse was true.) Similarly, as I have mentioned in previous blogs, Yoder et al. (2005) found that the children in their “no treatment” control accessed more treatment from other sources than did the children in their treatment arm. And parents randomized to the “watchful waiting” arm of the Glogowska et al. (2000) trial sometimes dropped out, because parents will do what they must to meet their child’s needs.

In closing, a randomized controlled trial is simply a study in which participants are randomly assigned to an experimental treatment and a control condition (even in a cross-over design, in which all participants experience all conditions, as in Rvachew et al., in press). The nature of the control should be determined after careful thought about the factors that you are attempting to control, which can be many: placebo, Hawthorne, fatigue, practice, history, maturation, and so on. These will vary from trial to trial, obviously. Placebo control does not mean “no treatment” but rather a treatment that excludes everything except the “active ingredient” that is the subject of your trial. As an SLP, when you are reading about studies that test the efficacy of a treatment, you need to pay attention to what happens to the control group as well as the treatment group. The trick is to ask in every case: What is the active ingredient that explains the effect seen in the treatment group? What else might account for the effects seen in the treatment arm of this study? If I implement this treatment in my own practice, how likely am I to get a better result compared to the treatment that my caseload is currently receiving?
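The defining feature described above, that chance alone decides which arm each participant enters, can be sketched in a few lines of code. This is a generic illustration; the participant IDs, the fixed seed, and the even two-way split are my own assumptions, not the procedure from any of the trials discussed here:

```python
# A minimal sketch of random assignment to trial arms. The participant IDs
# and the 50/50 split are illustrative assumptions, not any study's protocol.
import random

def randomize(participants, seed=2017):
    """Let chance alone decide each participant's arm: shuffle, then split."""
    rng = random.Random(seed)      # fixed seed only so the example is reproducible
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"experimental": shuffled[:half], "control": shuffled[half:]}

arms = randomize(["P01", "P02", "P03", "P04", "P05", "P06"])
# Every participant lands in exactly one arm; both arms then receive
# identical conditions except for the active ingredient under study.
```

Because assignment depends only on chance, characteristics such as maturation and history are, on average, balanced across arms, which is what licenses the causal inference.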

* A colleague sent me a paper (Mercer et al., 2007) in which a large number of researchers, advocating for the acceptance of a broader array of research designs in order to focus more attention on external validity and translational research, got together to discuss the merits of various designs. During the symposium it emerged that there was disagreement about the use of the terms “control” and “comparison” group. I use the terms in accordance with a minority of the attendees, as follows: “control group” means that the participants were randomly assigned to a group that did not experience the “active ingredient” of the experimental treatment; “comparison group” means that the participants were not randomly assigned to the group that did not experience the experimental intervention, a group that may or may not have received a treatment. This definition was ultimately not adopted by the attendees; I don’t know why. Somehow they decided on a different definition that doesn’t make any sense to me at all. I invite you to consult p. 141 and see if you can figure it out!


Glogowska, M., Roulstone, S., Enderby, P., & Peters, T. (2000). Randomised controlled trial of community based speech and language therapy in preschool children. British Medical Journal, 321, 923-928.

Hesser, H., Weise, C., Rief, W., & Andersson, G. (2011). The effect of waiting: A meta-analysis of wait-list control groups in trials for tinnitus distress. Journal of Psychosomatic Research, 70(4), 378-384. doi:10.1016/j.jpsychores.2010.12.006

Mercer, S. L., DeVinney, B. J., Fine, L. J., Green, L. W., & Dougherty, D. (2007). Study designs for effectiveness and translation research: Identifying trade-offs. American Journal of Preventive Medicine, 33(2), 139-154.e132. doi:10.1016/j.amepre.2007.04.005

Rvachew, S. (1994). Speech perception training can facilitate sound production learning. Journal of Speech and Hearing Research, 37, 347-357.

Rvachew, S., & Brosseau-Lapré, F. (2015). A randomized trial of twelve week interventions for the treatment of developmental phonological disorder in francophone children. American Journal of Speech-Language Pathology, 24, 637-658. doi:10.1044/2015_AJSLP-14-0056

Rvachew, S., Rees, K., Carolan, E., & Nadig, A. (in press). Improving emergent literacy with school-based shared reading: Paper versus ebooks. International Journal of Child-Computer Interaction. doi:10.1016/j.ijcci.2017.01.002

Yoder, P. J., Camarata, S., & Gardner, E. (2005). Treatment effects on speech intelligibility and length of utterance in children with specific language and intelligibility impairments. Journal of Early Intervention, 28(1), 34-49.

Do our patients prove that speech therapy works?

The third post in my series on Evidence Based Practice versus Patient Centred Care addresses the notion that the best source of evidence for patient centred care comes from the patient. I recall that when I was a speech-language pathology student in the 1970s, my professors were fond of telling us that we needed to treat each patient as a “natural experiment”. I was reminded of this recently when a controversy blew up in Canada over a study of Quebec’s universal daycare subsidy: the author of the study described the introduction of the subsidy as a “natural experiment”, and then this same economist went on to show himself completely confused about the nature of experiments! So, if you will forgive me, I am going to take a little detour through this study about daycare before coming back to the topic of speech therapy, with the goal of demonstrating why your own clients are not always the best source of evidence about whether your interventions are working, as counter-intuitive as this may seem.

Quebec introduced a universal daycare program in 1997, and a group of economists have published a few evaluations using data from the National Longitudinal Survey of Children and Youth (NLSCY), one looking at anxiety in younger kids and a more recent one describing crime rates when the kids were older. The studies are rather bizarre in that the children who accessed daycare (or not) do not provide data for these studies. Rather, province-wide estimates of variables such as likelihood of using daycare and childhood anxiety are obtained from the NLSCY, a survey of 2000 children from across Canada, collected every two years but followed longitudinally; the researchers then estimated province-wide youth criminal activity from a completely different survey rather than using the self-report measures from the NLSCY. Differences in these estimates (see post-script) from pre-daycare cohorts to post-daycare cohorts are compared for Quebec versus the ROC (rest of Canada, which does not have any form of universal childcare program). One author described the outcome this way: “looking at kids in their teens, we find indicators of health and life satisfaction got worse, along with teens being in more trouble with the law.” The statistical analysis and design are so convoluted that I was actually hoodwinked into thinking youth crime was going up in Quebec, when in fact youth crime was declining, just not as fast as in the ROC. Moreover, youth crime legislation and practices vary so dramatically across provinces, and particularly between Quebec and the ROC, that it is difficult indeed to compare rates of youth crime using the variable cited in the NBER paper (rates of accused or convicted youths; for discussion see Sprott). Then the authors attribute this so-called rise, but actual decline, in crime to “the effects of a sizeable negative shock to non-cognitive skills due to the introduction of universal child care in Quebec”.
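The pre-post, Quebec-versus-ROC comparison described above is a difference-in-differences calculation at heart. A minimal sketch with made-up numbers (not the actual NLSCY or crime-survey estimates) shows how a decline that is merely slower than the comparison jurisdiction’s decline surfaces as a positive “effect”:

```python
# Difference-in-differences with illustrative, made-up rates (not real data).

def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """Change in the treated jurisdiction minus change in the comparison one."""
    return (treated_post - treated_pre) - (control_post - control_pre)

# Hypothetical youth-crime rates per 10,000: crime falls in BOTH jurisdictions,
# but more slowly in the "treated" one.
quebec_pre, quebec_post = 50.0, 45.0   # fell by 5
roc_pre, roc_post = 50.0, 40.0         # fell by 10

effect = diff_in_diff(quebec_pre, quebec_post, roc_pre, roc_post)
# effect == 5.0: a positive estimate even though crime declined everywhere.
```

This is how a headline reading of “crime got worse” can be drawn from data in which crime actually fell in both places: the estimate only captures the relative difference between the two trends, not the direction of either one.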
Notwithstanding this nonsensical summary of the results of these really weird studies, the most inaccurate thing that Milligan said is that this study was a “natural experiment” which is “akin to a full randomized experiment such as Perry Preschool, but on a larger scale”. The thing is, a “natural experiment” is not an experiment at all: when the experiment is natural, you cannot determine the cause of the events that you are observing (although when you have enough high quality pairs of data points you can sometimes make decent inferences, which is NOT the case in this particular study). The economists know how to observe and describe naturally occurring events. They can estimate an increase in daycare use and changing rates of child anxiety and youth crime convictions in Quebec versus the ROC, and compare changing rates of these things between jurisdictions. What they cannot do is determine why daycare use changed, or reported anxiety changed, or convictions for youth crime changed. To answer the question “why”, you need an experiment. What’s more, experiments can only answer part of the “why” question.

So let’s return to the topic of speech therapy. We conduct small scale randomized controlled trials in my lab precisely because we want to answer the “why” question. We describe changes in children’s behavior over time, but we also want to know whether one or more of our interventions was responsible for any part of that change. In our most recently published RCT we found that even children who did not receive treatment for phonological awareness improved in this skill, but children who received two of our experimental interventions improved significantly more. Control group children did not change at all in articulation accuracy, whereas experimental group children did improve significantly. In scatterplots posted on my blog, we also showed that there are individual differences among children in the amount of change that occurs within the control group that did not experience the experimental treatments and within the experimental groups. Therefore, we know that there are multiple influences on child improvement in phonological awareness and articulation accuracy, but our experimental treatments account for the greater improvement in the experimental groups relative to the control group. We can be sure of this because the random assignment of children to treatments controls for history and maturation effects and other potential threats to the internal validity of our study. How do we apply this information as speech-language pathologists when we are treating children, one at a time?

When a parent brings a child for speech therapy, it is like a “natural experiment”. The parent, and maybe the child, are concerned about the child’s speech intelligibility and social functioning. The parent and the child are motivated to change. Coming to speech therapy is only one of the changes that they make, and given long waits for the service, it is probably the last in a series of changes that the family makes to help the child. Mum might change her work schedule, move the child to a new daycare, enlist the help of a grandparent, enroll the child in drama classes, read articles on the internet, join a support group, begin asking her child to repeat incorrect words, check out alliteration books from the library, and so on. Most importantly, the child gets older. Then he starts speech therapy and you put your shiny new kit for nonspeech oral motor exercises to use. Noticing that the child’s rate of progress picks up remarkably relative to the six-month period preceding the diagnostic assessment, you believe that this new (for you) treatment approach “works”.

What are the chances? It helps to keep in mind that a “natural experiment” is not an experiment at all. You are in the same position as the economists who observed historical change in Quebec and then tried to make causal inferences. One thing they did was return to the randomized controlled trial literature, ironically citing the Perry Preschool Project, which showed that a high quality preschool program reduced criminality in high risk participants. On the other hand, most RCTs find no link between daycare attendance and criminal behavior at all, so their chain of causal inferences seems particularly unwise. In the clinical case, you know that the child is changing, maybe even faster than many of your other clients. You don’t know which variable is responsible for the change. But you can make an educated guess by looking at the literature. Are there randomized controlled trials indicating that your treatment procedures cause greater change relative to a no-treatment or usual-care control group? If so, you have reason for optimism. If not, as in the case of nonspeech oral motor exercises, you are being tricked by maturation and history effects. If you have been tricked in this way, you shouldn’t feel bad, because I know some researchers who have mistaken history and maturation effects for a treatment effect. We should all try to avoid this error, however, if we are to improve outcomes for people with communication difficulties.


PS If you are interested in the difference-in-differences research method, here is a beautiful YouTube video about this design, used to assess handing out bicycles to improve school attendance by girls in India. In that case the design includes three differences (a difference-in-difference-in-differences design), and the implementation is of higher quality all round compared to the daycare study that I described. Nonetheless, even here, a randomized controlled trial would be more convincing.