Simultaneous Analysis of Multiple Clinical Trial Endpoints Is Feasible, Study Finds

by Marisa Wexler, MS

Translarna (ataluren) significantly improves physical fitness in people with Duchenne muscular dystrophy (DMD) and Becker muscular dystrophy (BMD), a study examining multiple endpoints across two trials shows.

The study used a statistical strategy for analyzing multiple trial endpoints together, a tactic that could improve the interpretation of a treatment’s overall effect, particularly in studies of rare diseases.

The study, “Assessment of Treatment Effect With Multiple Outcomes in 2 Clinical Trials of Patients With Duchenne Muscular Dystrophy,” was published in JAMA Network Open. It was funded in part by PTC Therapeutics, the company developing Translarna.

Clinical trials have several endpoints, which are tests used to measure whether a treatment is conferring a benefit. For example, the six-minute walk test (6MWT), which measures the distance a person can walk in six minutes, is commonly used as an endpoint to measure changes in physical fitness and exercise tolerance.

Typically, a trial will have one or two primary endpoints and multiple secondary endpoints, and these will be analyzed separately. But analyzing them separately risks producing confusing results, in which some endpoints show statistically significant effects and others do not.

This can be a particular problem with rare conditions, where small numbers of participants in trials limit the statistical power of analyses. In the new study, researchers demonstrated a statistical method for analyzing multiple trial endpoints collectively.

They performed their analysis using data from two clinical trials of Translarna (NCT00592553 and NCT01826487). In both trials, people with DMD or BMD were randomly assigned to treatment with Translarna or a placebo. Between the two trials, 171 people were treated with Translarna, and an equal number received placebo.

The treatment is intended to improve muscle function by increasing the production of the dystrophin protein. The primary endpoint for both trials was the 6MWT. Secondary endpoints included changes in time to walk or run 10 meters (about 33 feet), time to climb four stairs, and time to descend four stairs.

Results from the trials were reported previously, but the interpretation of the results wasn’t fully clear. Some endpoints were significant, while others were not. This likely contributed to PTC’s failure to receive approval for the treatment from the U.S. Food and Drug Administration in 2017 due to insufficient evidence of efficacy.

In attempting to analyze the four endpoints together, researchers faced an immediate statistical problem: the 6MWT measures distance (in meters), while the other three assessments measure time (in seconds). It was necessary to calculate a unitless statistical measure for all the measurements, which could then be combined by averaging.

To do this, the researchers calculated the z score — a measurement of how far a data point is from the average — for every endpoint.

The average z score across all eight endpoints (four in each trial) was 1.64. This suggests a meaningful effect, since, “if there were no differences between the two groups, each z score would be near zero randomly,” the researchers wrote.
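To make the idea of a unitless score concrete, here is a minimal Python sketch of how one endpoint's z score might be computed and how several such scores could be averaged. It is an illustration only, not the researchers' code: it assumes a simple two-sample z statistic (difference in group means divided by its standard error), it flips the sign for timed tests so that a positive score always favors treatment, and the function names and sample numbers are hypothetical.

```python
import math
from statistics import mean, variance


def endpoint_z(treated, placebo, higher_is_better=True):
    """Unitless z statistic comparing two groups on one endpoint.

    Assumption: difference in means divided by its standard error.
    The published analysis may define its z score differently.
    """
    diff = mean(treated) - mean(placebo)
    se = math.sqrt(variance(treated) / len(treated)
                   + variance(placebo) / len(placebo))
    z = diff / se
    # For timed tests a shorter time is better, so flip the sign
    # to keep "positive = favors treatment" across all endpoints.
    return z if higher_is_better else -z


def average_z(z_scores):
    """Combine per-endpoint z scores by simple averaging."""
    return mean(z_scores)


# Hypothetical numbers for illustration only (not trial data).
walk_m = endpoint_z([350, 372, 365, 390], [340, 355, 348, 362])
stairs_s = endpoint_z([4.1, 3.8, 4.4, 3.9], [4.6, 4.2, 4.9, 4.4],
                      higher_is_better=False)
print(average_z([walk_m, stairs_s]))
```

Because each score is a ratio of a difference to its standard error, meters and seconds cancel out, which is what allows a distance endpoint and a timed endpoint to be averaged together.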

To test statistical significance, the researchers performed a permutation test. Basically, for each endpoint, they took all the scores from both groups, pooled them, then randomly divided them into two new groups (regardless of treatment status), and repeated the statistical analysis.

By doing this one million times, the researchers could calculate how often an average z score as high as 1.64 would arise by chance alone. If there were a true treatment effect, such a score would be rare among the reshuffled data sets, since it would require a random split to happen to separate participants by treatment status.
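The shuffling procedure described above can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not the study's implementation: the z statistic is the same assumed two-sample form (difference in means over its standard error), the function names and the tuple layout of `endpoints` are hypothetical, and the sample data are invented. Each endpoint is shuffled on its own, as the description above suggests; shuffling whole patients instead would keep each person's results together across endpoints, which this sketch does not do.

```python
import math
import random
from statistics import mean, variance


def endpoint_z(treated, placebo, higher_is_better=True):
    """Assumed two-sample z statistic; positive favors treatment."""
    diff = mean(treated) - mean(placebo)
    se = math.sqrt(variance(treated) / len(treated)
                   + variance(placebo) / len(placebo))
    if se == 0:  # degenerate split with no spread in either group
        return 0.0
    z = diff / se
    return z if higher_is_better else -z


def mean_z(endpoints):
    """Average z across endpoints; each item is (treated, placebo, flag)."""
    return mean([endpoint_z(t, p, hib) for t, p, hib in endpoints])


def permutation_p_value(endpoints, n_permutations=10_000, seed=0):
    """Share of random regroupings whose average z reaches the observed one."""
    rng = random.Random(seed)
    observed = mean_z(endpoints)
    hits = 0
    for _ in range(n_permutations):
        shuffled = []
        for treated, placebo, hib in endpoints:
            # Pool both groups, shuffle, and split at the original group size.
            pooled = list(treated) + list(placebo)
            rng.shuffle(pooled)
            cut = len(treated)
            shuffled.append((pooled[:cut], pooled[cut:], hib))
        if mean_z(shuffled) >= observed:
            hits += 1
    return observed, hits / n_permutations


# Hypothetical data for illustration only (not trial data).
demo = [
    ([350, 372, 365, 390, 381], [340, 355, 348, 362, 351], True),   # meters
    ([4.1, 3.8, 4.4, 3.9, 4.0], [4.6, 4.2, 4.9, 4.4, 4.7], False),  # seconds
]
observed, p_value = permutation_p_value(demo, n_permutations=2_000)
print(observed, p_value)
```

A small returned fraction means that random regroupings almost never reproduce the observed average, which is the sense in which the result is called statistically significant. The sketch uses far fewer than one million shuffles to keep it quick to run.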

The analysis suggested that an average z score of 1.64 was statistically significant, “meaning that ataluren is statistically significantly better than placebo,” the researchers wrote.

The study supports the efficacy of Translarna, and more broadly, provides a template for analyzing multiple study endpoints collectively, which could be applied in future clinical trials.

The researchers said such analysis requires thoughtful selection of the endpoints to be analyzed when a trial is planned. For example, they said, their analysis would not be applicable to binary endpoints like survival, where there are only two possible outcomes, rather than a range of values.