Wednesday, August 05, 2020

The NEJM hydroxychloroquine study fails to notice its largest effect

Before hydroxychloroquine was a Donald Trump joke, the drug was considered a promising possibility for prevention and treatment of Covid-19. It had been previously shown to work against respiratory viruses in the lab, and, for decades, it was safely and routinely given to travellers before departing to malaria-infested regions. A doctor friend of mine (who, I am hoping, will have reviewed this post for medical soundness before I post it) recalls having taken it before a trip to India.

Travellers start on hydroxychloroquine two weeks before departure; this gives the drug time to build up in the body. Large doses at once can cause gastrointestinal side effects, but since hydroxychloroquine has a very long half-life in the body -- three weeks or so -- you build it up gradually.

For malaria, hydroxychloroquine can also be used for treatment. However, several recent studies have found it to be ineffective treating advanced Covid-19.

That leaves prevention. Can hydroxychloroquine be used to prevent Covid-19 infections? The "gold standard" would be a randomized double-blind placebo study, and we got one a couple of months ago, in the New England Journal of Medicine (NEJM). 

It concluded that there was no statistically significant difference between the treatment and placebo groups, and concluded

"After high-risk or moderate-risk exposure to Covid-19, hydroxychloroquine did not prevent illness compatible with Covid-19 or confirmed infection when used as postexposure prophylaxis within 4 days after exposure."

But ... after looking at the paper in more detail, I'm not so sure.


The study reported on 821 subjects who had been exposed, within the past four days, to a patient testing positive for Covid-19. They received a dose of either hydroxychloroquine or placebo for the next five days (the first day was a higher "loading dose"), and followed over the next couple of weeks to see if they contracted the virus.

The results:

49 of 414 treatment subjects (11.8%) became infected
58 of 407   placebo subjects (14.3%) became infected.

That's about 17 percent fewer cases in patients who got the real drug. 

But that wasn't a large enough difference to show statistical significance, with only about 400 subjects in each group. The paper recognizes that, stating the study was designed only with sufficient power to find a reduction of at least 50 percent, not the 17 percent reduction that actually appeared. Still, by the usual academic standards for this sort of thing, the authors were able to declare that "hydroxychloroquine did not prevent illness."

At this point I would normally rant about statistical significance and how "absence of evidence is not evidence of absence."  But I'll skip that, because there's something more interesting going on.


Recall that the study tested hydroxychloroquine on subjects who feared they were already exposed to the virus. That's not really testing prevention ... it's testing treatment, albeit early treatment. It does have elements of prevention in it, as perhaps the subjects may not have been infected at that point, but would be infected later. (The study doesn't say explicitly, but I would assume some of the exposures were to family members, so repeated exposures over the next two weeks would be likely.)

Also: it did take five days of dosing until the full dose of hydroxychloroquine was taken. That means the subject didn't get a full dose until up to nine days after exposure to the virus.

So this is where it gets interesting. Here's Figure 2 from the paper:

These lines are cumulative infections during the course of the study. As of day 5, there were actually more infections in the group that took hydroxychloroquine than in the group that got the placebo ... which is perhaps not that surprising, since the subjects hadn't finished their full doses until that fifth day. By day 10, the placebo group has caught up, but the groups are still about equal.

But now ... look what happens from Day 10 to Day 14. The group that got the hydroxychloroquine doesn't move much ... but the placebo group shoots up.

What's the difference in new cases? The study doesn't give the exact numbers that correspond to the graph, so I used a pixel ruler to measure the distances between points of the graph. It turns out that from Day 10 to Day 14, they found:

-- 11 new infections in the placebo group
--  2 new infections in the hydroxychloroquine group.

What is the chance that of 13 new infections, they would get split 11:2? 

About 1.12 percent one-tailed, 2.24 percent two-tailed.

Now, I know that it's usually not legitimate to pick specific findings out of a study ... with 100 findings, you're bound to find one or two random ones that fall into that significance level. But this is not an arbitrary random pattern -- it's exactly what we would have expected to find if hydroxychloroquine worked as a preventative. 

It takes, on average, about a week for symptoms to appear after COVID-19 infection. So for those subjects in the "1-5" group, most were probably infected *before* the start of their hydroxychloroquine regimen (up to four days before, as the study notes). So those don't necessarily provide evidence of prevention. 

In the "6-10" group, we'd expect most of them to have been already infected before the drugs were administered; the reason they were admitted to the study in the first place was because they feared they had been exposed. So probably many of those who didn't experience symptoms until, say, Day 9, were already infected but had a longer incubation period. Also, most of the subsequently-infected subjects in that group probably got infected in the first five days, while they didn't have a full dose of the drug yet.

But in the last group, the "11-14" group, that's when you'd expect the largest preventative effect -- they'd have had a full dose of the drug for at least six days, and they were the most likely to have become infected only after the start of the trial.

And that's when the hydroxychloroquine group had an 84 percent lower infection rate than the placebo group.


In everything I've been reading about hydroxychloroquine and this study, I have not seen anyone notice this anomaly, that beyond ten days, there were almost seven times as many infections among those who didn't get the hydroxychloroquine. In fact, even the authors of the study didn't notice. They stopped the study on the basis of "futility" once they realized they were not going to achieve statistical significance (or, in other words, once they realized the reduction in infections was much less than the 50% minimum they would endorse). In other words: they stopped the study just as the results were starting to show up! 

And then the FDA, noting the lack of statistical significance, revoked authorization to use hydroxychloroquine.

I'm not trying to push hydroxychloroquine here ... and I'm certainly not saying that I think it will definitely work. If I had to give a gut estimate, based on this data and everything else I've seen, I'd say ... I dunno, maybe a 15 percent chance. Your guess may be lower. Even if your gut says there's only a one-in-a-hundred chance that this 84 percent reduction is real and not a random artifact ... in the midst of this kind of pandemic, isn't even 1 percent enough to say, hey, maybe it's worth another trial?

I know hydroxychloroquine is considered politically unpopular, and it's fun to make a mockery of it. But these results are strongly suggestive that there might be something there. If we all agree that Trump is an idiot, and even a stopped clock is right twice a day, can we try evaluating the results of this trial on what the evidence actually shows? Can we not elevate common sense over the politics of Trump, and the straitjacket of statistical significance, and actually do some proper science?



At Wednesday, August 05, 2020 11:51:00 PM, Anonymous Anonymous said...

The error bars aren't there for decorative purposes.

At Thursday, August 06, 2020 12:20:00 AM, Blogger Phil Birnbaum said...

They are pretty, though, aren't they?

The error bars are there for the cumulative incidence. If there were error bars for only the difference between the third panel and the second panel, they would not intersect at p<.05, since the probability of 13 cases splitting 11-2 randomly would be less than 5 percent.

At Sunday, August 09, 2020 6:34:00 PM, Anonymous Anonymous said...

Hi, this is an interesting observation. You might also want to take a look at the supplemental data (fig s1) for another likely time-dependent effect... This time even easier to interpret.

At Sunday, August 09, 2020 6:46:00 PM, Blogger Phil Birnbaum said...

Nice! Thank you!


Post a Comment

<< Home