Symposia
Research Methods and Statistics
Alexander O. Crenshaw, Ph.D. (he/him/his)
Clinical Research Psychologist
Toronto Metropolitan University
Toronto, Ontario, Canada
Candice M. Monson, Ph.D. (she/her/hers)
Professor
Toronto Metropolitan University
Toronto, Ontario, Canada
When evaluating interventions in clinical trials, standard practice is to report standardized effect sizes and reliable change indices. Standardized effect sizes convert change from the raw units of an outcome measure into standard deviation units, whereas reliable change quantifies the number of individuals in a treatment who changed to an extent greater than expected by chance. The two indices are complementary: standardized effect sizes quantify average group change or average group difference in change, whereas reliable change quantifies the number of individuals who changed to a statistically meaningful degree. In principle, these metrics provide uniform criteria for comparing intervention effects across different studies and outcomes. In current practice, however, the criteria against which standardized effect sizes are computed and reliable change is evaluated are constructed from sample-specific estimates. Sample-specific estimates are inherently subject to sampling error; consequently, the criteria are also subject to sampling error and therefore differ across studies and outcomes. We call this phenomenon the "variable ruler problem," in which the criteria, or "rulers," for evaluating standardized effect sizes and reliable change vary across studies due to sampling error alone.
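To make the dependence on sample-specific estimates concrete, the conventional indices can be written as follows (a sketch assuming a pretest-standardized Cohen's d and the Jacobson-Truax formulation of the reliable change index; the notation is ours):

\[
d = \frac{\bar{X}_{\text{post}} - \bar{X}_{\text{pre}}}{s_{\text{pre}}},
\qquad
\text{RC} = \frac{x_{\text{post}} - x_{\text{pre}}}{s_{\text{pre}}\sqrt{2\,(1 - r_{xx})}},
\]

where \(s_{\text{pre}}\) is the sample standard deviation of the measure at pretest and \(r_{xx}\) is its estimated reliability; \(|\text{RC}| > 1.96\) is conventionally taken as reliable change, equivalent to a raw-score cut-off of \(1.96\, s_{\text{pre}} \sqrt{2(1 - r_{xx})}\). Because both \(s_{\text{pre}}\) and \(r_{xx}\) are sample estimates, every criterion built from them inherits their sampling error.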
This talk introduces the variable ruler problem and shows how common practices for computing effect sizes and reliable change can lead to variable rulers when evaluating clinical trials. We present the results of Monte Carlo simulations demonstrating the impact of sampling error on the variable ruler problem. Under common scenarios, the average deviation in standardized effect size estimates ranged from 5% to 23.6% based on sampling error in the sample standard deviation alone. The average deviation in the reliable change cut-off ranged from 5% to 105%. Deviation was largest in small samples. The reliability of the outcome measure had the greatest impact: sampling error in this estimate had profound effects on the reliable change cut-off, particularly for measures with high reliability. We provide novel recommendations for future studies to improve the comparability of standardized effect size and reliable change indices across studies and outcomes. These recommendations aim to better standardize the "rulers" we use when evaluating clinical interventions.
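The flavor of such a simulation can be sketched in a few lines of Python (an illustrative sketch only, not the authors' simulation; the population standard deviation of 10, reliability of .90, test-retest design, and sample sizes are assumptions chosen for demonstration):

import numpy as np

rng = np.random.default_rng(2024)

def rc_cutoff(sd, reliability):
    # Jacobson-Truax reliable change cut-off in raw-score units:
    # 1.96 * sqrt(2) * SD * sqrt(1 - reliability)
    return 1.96 * np.sqrt(2.0) * sd * np.sqrt(1.0 - reliability)

def simulate(n, true_sd=10.0, true_rel=0.90, n_reps=10_000):
    # Repeatedly "run a study": estimate the SD and test-retest reliability
    # from a sample of size n, then compare the sample-based cut-off (the
    # "ruler") with the cut-off implied by the population values.
    true_cut = rc_cutoff(true_sd, true_rel)
    cov = true_sd**2 * np.array([[1.0, true_rel], [true_rel, 1.0]])
    devs = np.empty(n_reps)
    for i in range(n_reps):
        scores = rng.multivariate_normal([0.0, 0.0], cov, size=n)  # test-retest pairs
        sd_hat = scores[:, 0].std(ddof=1)
        rel_hat = np.corrcoef(scores[:, 0], scores[:, 1])[0, 1]
        devs[i] = abs(rc_cutoff(sd_hat, rel_hat) - true_cut) / true_cut
    return devs.mean()

for n in (20, 50, 200):
    print(f"n={n:>3}: average relative deviation in RC cut-off = {simulate(n):.1%}")

Because the cut-off depends on sqrt(1 - r_xx), small sampling fluctuations in the reliability estimate are magnified when true reliability is high, which is consistent with the pattern described above.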