For better or worse: An individual patient data meta-analysis of deterioration among participants receiving Internet-based cognitive behavior therapy

During recent decades, a number of systematic reviews and meta-analyses have provided increasing support for the use of psychological treatments as a way of alleviating mental distress and enhancing well-being (Cuijpers et al., 2013; Cuijpers, Smit, Bohlmeijer, Hollon, & Andersson, 2010; Hofmann, Asnaani, Vonk, Sawyer, & Fang, 2012). In addition, great efforts have been made to improve access to evidence-based methods, such as cognitive behavior therapy (CBT), in order to disseminate effective psychological treatments to patients suffering from a variety of psychiatric disorders (Clark et al., 2009; McHugh & Barlow, 2010; Shafran et al., 2009). Meanwhile, psychological treatments delivered via formats other than face-to-face, for instance, Internet-based cognitive behavior therapy (ICBT) and different smartphone applications, have the potential to become an important and widely used addition to the health-care system, delivering evidence-based methods to an even larger population at a significantly lower cost (Andersson & Titov, 2014; Andrews, Cuijpers, Craske, McEvoy, & Titov, 2010; Hedman, Ljotsson, & Lindefors, 2012), and with similar benefits for many patients (Andersson, Rozental, Rück, & Carlbring, 2015; Cuijpers, Donker, van Straten, Li, & Andersson, 2010; Richards & Richardson, 2012). However, although promising steps have been taken to help those who suffer from a psychiatric disorder, research on psychological treatments has focused almost entirely on their positive aspects, particularly average treatment outcome and the number of patients who achieve clinically significant change, while paying far less attention to the possible existence of negative effects (Barlow, 2010).
Few clinical trials of psychological treatments report adverse events occurring during the treatment period (Berk & Parker, 2009; Linden & Schermuly-Haupt, 2014; Rozental et al., 2014), with a recent review indicating that information concerning risks was described in only 28 out of 132 (21%) published randomized controlled trials (Jonsson, Alaie, Parling, & Arnberg, 2014). Compared with pharmacological research, clinical trials of psychological treatments were nine to twenty times less likely to mention possible or actual negative effects (Vaughan, Goldstein, Alikakos, Cohen, & Serby, 2014). The idea that some patients could experience adverse events due to the treatment they undergo has also largely been ignored throughout the history of psychological treatments, receiving little consideration by researchers and clinicians (Lilienfeld, 2007), even though the first empirical evidence on this issue was in fact published more than six decades ago (Powers & Witmer, 1951). One notable exception is Bergin (1966), who sparked a debate about the “client-deterioration phenomenon” (p. 236), or the deterioration effect. In reviewing the results from seven clinical trials, Bergin argued that, apart from those patients who improve and those who do not respond, some patients seem to get worse during the course of their psychological treatment. Although this claim was criticized because of the difficulty of determining causality (cf. May, 1971), several investigations have since indicated that deterioration appears to be relatively common and occurs across psychiatric disorders and treatment conditions (cf. Hansen, Lambert, & Forman, 2002), with average deterioration rates ranging between 5% and 10%, and with even higher rates among children, adolescents, and substance abuse patients (Lambert, 2013; Rhule, 2005; Swift, Callahan, Heath, Herbert, & Levine, 2010).
With regard to ICBT, similar findings have recently been reported in several randomized controlled trials (Boettcher, Rozental, Andersson, & Carlbring, 2014; Bruggerman Everts, van der Lee, & de Jager Meezenbroek, 2015; Kivi et al., 2014).


Assessing negative effects

Deterioration, defined as a worsening in symptomatology, is not the only way to assess negative effects, and several other suggestions on how to define and monitor adverse events occurring during psychological treatments have been proposed (Mays & Franks, 1980; Mohr, 1995; Strupp & Hadley, 1977). For instance, decreased interpersonal functioning, dependency, and lowered self-esteem have all been put forward as detrimental effects of treatment, and could result in a better understanding of the mechanisms that may be responsible for negative effects if further scrutinized (Dimidjian & Hollon, 2010). For example, when reviewing the literature on negative effects possibly induced by psychological treatments, Boisvert and Faust (2002) discussed how labels, altered self-perceptions, and social roles might be related to a negative outcome. Similarly, Rozental, Boettcher, Andersson, Schmidt, and Carlbring (2015) conducted a qualitative content analysis on the responses to a number of open-ended questions concerning negative effects distributed in four clinical trials of ICBT, demonstrating that insight about what maintains a psychiatric disorder was experienced as distressing by some patients, as was the development of novel symptoms, such as insomnia and stress, difficulties implementing the treatment interventions, and a lack of feedback and guidance. However, although interesting from a theoretical perspective, such investigations often warrant the use of qualitative analyses that may prove hard to generalize. Furthermore, without the systematic use of standardized self-report measures explicitly probing for adverse events, as well as clearly defined and operationalized concepts of what constitutes different types of negative effects, the results would be difficult to interpret. 
Several suggestions on how to overcome some of these problems and enable the monitoring of negative effects other than deterioration have recently been put forward, including both therapist checklists and self-report measures distributed to the patient (Linden, 2013; Nestoriuc & Rief, 2012; Parker, Fletcher, Berk, & Paterson, 2013), but their use is currently limited and their validity is not yet established. Deterioration is therefore still one of the most straightforward methods for detecting and examining negative effects, with the additional advantages of being easy for researchers and clinicians to comprehend as well as allowing comparisons across clinical trials.

Assessing deterioration

Assessing deterioration can be a complex procedure that requires both theoretical and statistical considerations. Compared to investigating improvement, which typically involves the use of predefined cutoffs on a specific self-report measure or the calculation of what constitutes a clinically significant change, deterioration lacks a frequently used or agreed-upon approach for determining when the condition of a patient has declined. A negative change score from pre to post treatment may indicate that a patient has deteriorated, but how large that change needs to be is unclear (Mohr et al., 1990). This issue is further complicated by the fact that a patient cannot deteriorate indefinitely due to ceiling effects, as well as the problem of perceiving deterioration as a distribution of scores distinct from those of a dysfunctional and functional population (Martinovich, Saunders, & Howard, 1996). Jacobson, Follette, and Revenstorf (1984) were among the first to recognize these concerns, “There is no obvious counterpart to our distributional cutoff for clinical significance in the assessment of deterioration rates” (p. 350), suggesting that the investigation of deterioration is limited to the implementation of the Reliable Change Index (RCI), that is, inspecting whether the deterioration is reliable and not only caused by measurement error. The basic method for calculating the RCI was later refined by L. Christensen and Mendoza (1986) and outlined by Jacobson and Truax (1991) as the change score between pre and post treatment divided by the standard error of difference between the two test scores. If the resulting RCI is larger than z = 1.96, a change score of that size would be unlikely (p = .05) to arise without a true change having occurred.
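The index described above can be restated compactly as follows. The symbols are introduced here for illustration only and do not reproduce the notation of the cited sources: x_pre and x_post denote a patient's scores before and after treatment, and S_diff the standard error of difference between the two test scores.

```latex
\mathrm{RCI} = \frac{x_{\text{post}} - x_{\text{pre}}}{S_{\text{diff}}},
\qquad
|\mathrm{RCI}| > 1.96 \;\Rightarrow\; \text{change unlikely to reflect measurement error alone } (p = .05)
```

Whether a positive or a negative RCI corresponds to deterioration depends on the scoring direction of the self-report measure in question.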
However, how the standard error of difference should be derived was never explicitly mentioned; only that it could be calculated from the standard error of measurement, creating some confusion on whether to use the internal consistency, such as Cronbach’s α, or the test-retest reliability of the self-report measure, for instance, Pearson r. This issue has later been shown to have implications for determining the improvement and deterioration rates among patients receiving psychological treatment (Speer, 1992). Most notably, the internal consistency reflects whether the self-report measure being used consists of a single unidimensional construct, that is, assuming that it measures only one factor or concept, while the test-retest reliability introduces variation because of separate occasions of measurement, resulting in lower reliability. Consequently, the less reliable the self-report measure being administered, the greater the actual change score has to be in order for it to be regarded as reliable (Evans, Margison, & Barkham, 1998). Thus, if the idea were to assess a relatively stable trait or feature, internal consistency would probably suffice, but if some fluctuation is expected to occur, as when examining symptoms of a given psychiatric disorder, test-retest reliability is recommended (Edwards, Yarvis, Mueller, Zingale, & Wagman, 1978). Furthermore, Tingey, Lambert, Burlingame, and Hansen (1996) argued that the test-retest reliability should be derived from a normal population and cover a short time frame so that it represents the measurement error of scores from a specific interval, preferably one to two weeks, rather than any potential change that could be attributed to a psychological treatment. When several test-retest reliabilities are available for the same self-report measure, it is also suggested that the median should be used.
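To illustrate why the choice of reliability coefficient matters, the following is a minimal Python sketch. It assumes the formulation commonly attributed to Jacobson and Truax (1991), in which the standard error of measurement is SD × sqrt(1 − r) and the standard error of difference is sqrt(2 × SE²); the text above does not spell these formulas out. The standard deviation and all reliability values are hypothetical and chosen only for illustration.

```python
import math
import statistics


def standard_error_of_difference(sd: float, reliability: float) -> float:
    """Assumed formulation: S_diff = sqrt(2 * SE^2), SE = sd * sqrt(1 - r)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    se_measurement = sd * math.sqrt(1.0 - reliability)
    return math.sqrt(2.0 * se_measurement ** 2)


def minimum_reliable_change(sd: float, reliability: float, z: float = 1.96) -> float:
    """Smallest raw change score that exceeds the chosen z criterion."""
    return z * standard_error_of_difference(sd, reliability)


# Hypothetical values, not taken from any real self-report measure.
SD_PRE = 10.0
CRONBACH_ALPHA = 0.90
# With several test-retest coefficients available, use the median.
TEST_RETEST = statistics.median([0.70, 0.75, 0.82])  # 0.75

change_alpha = minimum_reliable_change(SD_PRE, CRONBACH_ALPHA)
change_retest = minimum_reliable_change(SD_PRE, TEST_RETEST)

print(f"Using internal consistency: {change_alpha:.2f} points")   # 8.77
print(f"Using test-retest reliability: {change_retest:.2f} points")  # 13.86
```

With these made-up numbers, the lower test-retest coefficient requires a change of roughly 14 points rather than 9 before it counts as reliable, which is why the choice of coefficient can alter the estimated deterioration rate.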
However, acquiring the necessary information for a specific self-report measure can often be difficult, especially for non-clinical samples. Furthermore, different alterations to the RCI have been presented over the years, taking into account regression to the mean (Speer, 1992) as well as correcting for error at both pre and post treatment (Hageman & Arrindell, 1993), complicating the issue further, although a comparison of various methods by Bauer, Lambert, and Nielsen (2004) still recommends the original suggestion by Jacobson and Truax (1991). In addition, the RCI has almost entirely been used for investigating the number of patients having improved rather than deteriorated (Hiller, Schindler, & Lambert, 2012). Hence, in determining deterioration, the same standard deviation units have been used as for assessing recovery, that is, z = 1.96, even though it could be argued that a less strict criterion should be implemented in order to account for those who also experience milder forms of deterioration. Wise (2004), therefore, proposed several reliable change indexes to discriminate between different confidence levels related to deterioration, z = 1.28 for moderate deterioration (p = .10), as well as z = 0.84 for mild deterioration (p = .20), which could reveal negative effects that might otherwise have been overlooked. In other words, when examining deterioration among patients receiving psychological treatment, one needs to consider a number of issues related to the self-report measure being distributed, which standard error of measurement can be obtained, and which reliable change index to use, before calculating the RCI.
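The graded criteria can be sketched in Python as follows. This is an illustration under stated assumptions rather than the procedure of any cited study: the function name, the label for the z = 1.96 level, and the example scores are hypothetical, and the sketch assumes by default a symptom measure on which higher scores mean a worse condition.

```python
# z criteria mentioned in the text, ordered from strictest to most lenient.
# The label for z = 1.96 is a hypothetical name chosen for this sketch.
THRESHOLDS = [
    (1.96, "reliable deterioration"),   # p = .05
    (1.28, "moderate deterioration"),   # p = .10
    (0.84, "mild deterioration"),       # p = .20
]


def classify_deterioration(pre: float, post: float, s_diff: float,
                           higher_is_worse: bool = True) -> str:
    """Classify a pre-to-post change against graded RCI criteria."""
    if s_diff <= 0:
        raise ValueError("s_diff must be positive")
    # Orient the change so that a positive value always means worsening.
    worsening = (post - pre) if higher_is_worse else (pre - post)
    rci = worsening / s_diff
    for z, label in THRESHOLDS:
        if rci > z:
            return label
    return "no reliable deterioration"


# Hypothetical scores with an assumed standard error of difference of 5.
for post_score in (31, 27, 25, 22):
    print(post_score, classify_deterioration(pre=20, post=post_score, s_diff=5.0))
```

A patient moving from 20 to 27 would go undetected under the conventional z = 1.96 criterion (RCI = 1.4) but would be flagged as moderately deteriorated under the more lenient criteria.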

Individual patient data meta-analysis

Deterioration in itself is insufficient in determining what factors might be contributing to its occurrence. In order to understand why some patients fare worse during the course of their psychological treatment, research on possible predictors of deterioration is required (Castonguay, Boswell, Constantino, Goldfried, & Hill, 2010). However, due to the relatively small number of patients actually deteriorating in a single clinical trial, statistical analyses will often be underpowered to find meaningful differences. As mentioned by Edwards et al. (1978), “Very large samples of patients would have to be used to develop a large enough group for reliable determination of predictors of deterioration” (p. 286). Thus, to enable a more rigorous study of predictors of deterioration and discover potential subgroups that are at risk of becoming worse, large amounts of data are needed (Koopman, van der Heijden, Glasziou, Grobbee, & Rovers, 2007). Individual patient data meta-analysis (IPDM) is an approach for synthesizing information from a number of clinical trials, using the raw scores from each patient for a more powerful examination of effects (Oxman, Clarke, & Stewart, 1995). By collecting data from several studies, it is possible to undertake much more sophisticated statistical analyses and deal with some of the difficulties associated with investigating less frequently occurring events (L. Stewart & Clarke, 1995), such as deterioration. In relation to ICBT, this approach has previously been used to assess the influence of the baseline severity level of depression on the effectiveness of treatment interventions delivered via the Internet, indicating that the more clinically severe patients benefit from their psychological treatment as much as those with less severity (Bower et al., 2013). Similarly, Karyotaki et al. (2015) found that dropout can be predicted by a number of variables, particularly male gender, lower educational level, younger age, and comorbid anxiety.
However, with regard to deterioration in ICBT, no previous attempt has been made to examine its occurrence or potential predictors using IPDM. The current study is therefore, to the knowledge of the authors, the first to assemble data from numerous clinical trials of ICBT for different psychiatric disorders in order to explore what factors might be related to deterioration during CBT delivered via the Internet. The primary aim is to determine the deterioration rates among patients receiving ICBT and to investigate plausible predictors of deterioration using variables that, theoretically or empirically, have been suggested to increase the risk of faring worse. Secondly, similar objectives are intended for those allocated to some form of control condition. As stated by Edwards et al. (1978), “The isolation of factors predicting or identifying a high risk for deterioration would allow intake screening to route these types of patients to alternative treatments” (p. 279), which would help researchers and clinicians pinpoint patients that are less suitable for ICBT or who may warrant more attention and guidance from a therapist.


An individual patient data meta-analysis of 29 clinical trials of ICBT (N = 2866) was performed using the Reliable Change Index for each primary outcome measure to identify deterioration rates among patients in treatment and control conditions. Statistical analyses of predictors were conducted using generalized linear mixed models. Missing data were handled by multiple imputation.


In total, 122 patients (5.8%) in treatment conditions and 130 (17.4%) in control conditions deteriorated. Relative to receiving treatment, patients in a control condition had higher odds of deteriorating, odds ratio (OR) 3.10, 95% confidence interval (CI) [2.21-4.34]. Clinical severity at pre treatment was related to lower odds, OR 0.62, 95% CI [0.50-0.77], and OR 0.51, 95% CI [0.51-0.80], for treatment and control conditions, respectively. In terms of sociodemographic variables, being in a relationship, OR 0.58, 95% CI [0.35-0.95], having at least a university degree, OR 0.54, 95% CI [0.33-0.88], and being older, OR 0.78, 95% CI [0.62-0.98], were also associated with lower odds of deterioration, but only for patients assigned to a treatment condition.


Deterioration among patients receiving ICBT or being in a control condition can occur and should be monitored by researchers in order to reverse and prevent a negative treatment trend.

Read the full paper (available soon):

Rozental, A., Magnusson, K., Boettcher, J., Andersson, G., & Carlbring, P. (in press). For better or worse: An individual patient data meta-analysis of deterioration among participants receiving Internet-based cognitive behavior therapy. Journal of Consulting and Clinical Psychology.