Ovarian cancer and HRT

Yesterday’s big health story in the news was the finding that HRT ‘increases ovarian cancer risk’. The scare quotes there, of course, tell us that that’s probably not really true.

So let’s look at the study and see what it really tells us. The BBC can be awarded journalism points for linking to the actual study in the above article, so it was easy enough to find the relevant paper in the Lancet.

This was not new data: rather, it was a meta-analysis of existing studies. Quite a lot of existing studies, as it turns out: the authors found 52 epidemiological studies investigating the association between HRT use and ovarian cancer. That is quite impressive. Despite ovarian cancer being a thankfully rare disease, the analysis included over 12,000 women who had developed it. So whatever other criticisms we might make of the paper, I don’t think a small sample size is going to be one of them.

But what other criticisms might we make of the paper?

Well, the first thing to note is that the data are from epidemiological studies. There is a crucial difference between epidemiological studies and randomised controlled trials (RCTs). If you want to know if an exposure (such as HRT) causes an outcome (such as ovarian cancer), then the only way to know for sure is with an RCT. In an epidemiological study, where you are not doing an experiment, but merely observing what happens in real life, it is very hard to be sure if an exposure causes an outcome.

The study showed that women who take HRT are more likely to develop ovarian cancer than women who don’t take HRT. That is not the same thing as showing that HRT caused the excess risk of ovarian cancer. It’s possible that HRT was the cause, but it’s also possible that women who suffer from unpleasant menopausal symptoms (and so are more likely to take HRT than those women who have an uneventful menopause) are more likely to develop ovarian cancer. That’s not completely implausible. Ovaries are a pretty relevant organ in the menopause, and so it’s not too hard to imagine some common factor that predisposes both to unpleasant menopausal symptoms and an increased ovarian cancer risk.

And if that were the case, then the observed association between HRT use and ovarian cancer would be completely spurious.

So what this study shows us is a correlation between HRT use and ovarian cancer, but as I’ve said many times before, correlation does not equal causation. I know I’ve been moaned at by journalists for endlessly repeating that fact, but I make no apology for it. It’s important, and I shall carry on repeating it until every story in the mainstream media about epidemiological research includes a prominent reminder of that fact.

Of course, it is certainly possible that HRT causes an increased risk of ovarian cancer. We just cannot conclude it from that study.

It would be interesting to look at how biologically plausible it is. Now, I’m no expert in endocrinology, but one little thing I’ve observed makes me doubt the plausibility. We know from a large randomised trial that HRT increases breast cancer risk (at least in the short term). There also seems to be evidence that oral contraceptives increase breast cancer risk but decrease ovarian cancer risk. With my limited knowledge of endocrinology, I would have thought the biological effects of HRT and oral contraceptives on cancer risk would be similar, so it just strikes me as odd that they would have similar effects on breast cancer risk but opposite effects on ovarian cancer risk. Anyone who knows more about this sort of thing than I do, feel free to leave a comment below.

But leaving aside the question of whether the results of the latest study imply a causal relationship (though of course we’re not really going to leave it aside, are we? It’s important!), I think there may be further problems with the study.

The paper tells us, and this was widely reported in the media, that “women who use hormone therapy for 5 years from around age 50 years have about one extra ovarian cancer per 1000 users”.

I’ve been looking at how they arrived at that figure, and it’s not totally clear to me how it was calculated. The crucial data in the paper are in the table below. The table is given in a bit more detail in their appendix, and I’m reproducing the part covering 5 years of HRT use here.

 

Age group   Baseline risk (per 1000)   Relative excess risk   Absolute excess risk (per 1000)
50-54       1.2                        0.43                   0.52
55-59       1.6                        0.23                   0.37
60-64       2.1                        0.05                   0.10
Total                                                         0.99

The table is a bit complicated, so some words of explanation are probably helpful. The baseline risk is the probability (per 1000) of developing ovarian cancer over a 5 year period in the relevant age group. The relative excess risk is the proportional amount by which that risk is increased by 5 years of HRT use starting at age 50. The absolute excess risk is the baseline risk multiplied by the relative excess risk.

The absolute excess risks for the three age groups are then added together to give the total excess lifetime risk of ovarian cancer for a woman who takes HRT for 5 years starting at age 50. I assume excess risks at older ages are ignored because there is no evidence that HRT increases the risk after such a long delay. It’s important to note here that the figure of 1 in 1000 excess ovarian cancer cases refers to lifetime risk: not the excess in any single 5 year period.
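That arithmetic is easy enough to check. Here is a minimal sketch in Python, using the rounded figures from the table above (my own check, not anything taken from the paper):

```python
# Figures from the appendix table (rounded); risks are per 1000 women over a 5 year period.
baseline = {"50-54": 1.2, "55-59": 1.6, "60-64": 2.1}
relative_excess = {"50-54": 0.43, "55-59": 0.23, "60-64": 0.05}

absolute_excess = {age: baseline[age] * relative_excess[age] for age in baseline}
print(absolute_excess)                          # roughly 0.52, 0.37 and 0.10
print(round(sum(absolute_excess.values()), 2))  # 0.99 extra cases per 1000 users
```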

The figures for incidence seem plausible. The figures for absolute excess risk are correct if the relative excess risks are correct. However, it’s not completely clear where the figures for relative excess risk come from. We are told they come from figure 2 in the paper. Maybe I’m missing something, but I’m struggling to match the 2 sets of figures. The relative excess risk of 0.43 for the 50-54 year age group matches the relative risk of 1.43 for current users with duration < 5 years (which will be true while the women are still in that age group), but I can’t see where the relative excess risks of 0.23 and 0.05 come from.

Maybe it doesn’t matter hugely, as the numbers in figure 2 are in the same ballpark, but it always makes me suspicious when numbers should match and don’t.

There are some further statistical problems with the paper. This is going to get a bit technical, so feel free to skip the next two paragraphs if you’re not into statistical details. To be honest, it all pales into insignificance anyway beside the more serious problem that correlation does not equal causation.

The methods section tells us that cases were matched with controls. We are not told how the matching was done, which is the sort of detail I would not expect to see left out of a paper in the Lancet. But crucially, a matched case control study is different to a non-matched case control study, and it’s important to analyse it in a way that takes account of the matching, with a technique such as conditional logistic regression. Nothing in the paper suggests that the matching was taken into account in the analysis. This may mean that the confidence intervals for the relative risks are wrong.
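For what it’s worth, here is a minimal sketch of what an analysis that respects the matching might look like, on entirely made-up data. It assumes statsmodels (version 0.10 or later, which provides ConditionalLogit); the variable names (case, hrt_use, matched_set) are hypothetical, and nothing here reproduces the paper’s actual analysis:

```python
import numpy as np
import pandas as pd
from statsmodels.discrete.conditional_models import ConditionalLogit

# Toy matched case-control data: 1000 matched sets, each with one case and four controls.
rng = np.random.default_rng(1)
rows = []
for s in range(1000):
    for i in range(5):
        case = 1 if i == 0 else 0
        p_hrt = 0.25 if case else 0.20      # made-up difference in HRT use
        rows.append({"matched_set": s, "case": case, "hrt_use": rng.binomial(1, p_hrt)})
df = pd.DataFrame(rows)

# The conditional likelihood stratifies on the matched set, so the matching
# factors themselves drop out of the model.
model = ConditionalLogit(df["case"], df[["hrt_use"]], groups=df["matched_set"])
result = model.fit()
print(np.exp(result.params))      # odds ratio for HRT use
print(np.exp(result.conf_int()))  # 95% confidence interval
```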

It also seems odd that the data were analysed using Poisson regression (and no, I’m not going to say “a bit fishy”). Poisson regression makes the assumption that the baseline risk of developing ovarian cancer remains constant over time. That seems a highly questionable assumption here. It would be interesting to see if the results were similar using a method with more relaxed assumptions, such as Cox regression. It’s also a bit fishy (oh damn, I did say it after all) that the paper tells us that Poisson regression yielded odds ratios. Poisson regression doesn’t normally yield odds ratios: the default statistic is an incidence rate ratio. Granted, the interpretation is similar to an odds ratio, but they are not the same thing. Perhaps there is some cunning variation on Poisson regression in which the analysis can be coaxed into giving odds ratios, but if there is, I’m not aware of it.
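And to illustrate the incidence rate ratio point: in a Poisson regression with a log person-time offset, the exponentiated coefficient compares event rates per unit of person-time, which is not the same thing as an odds ratio. A minimal sketch on made-up data (again assuming statsmodels; this is not the paper’s model):

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 5000
hrt = rng.binomial(1, 0.3, n)                  # hypothetical exposure indicator
person_years = rng.uniform(1, 5, n)            # follow-up time per woman
baseline_rate = 0.02                           # events per person-year (made up)
true_rate_ratio = 1.4
events = rng.poisson(baseline_rate * true_rate_ratio**hrt * person_years)

X = sm.add_constant(pd.DataFrame({"hrt": hrt}))
model = sm.GLM(events, X, family=sm.families.Poisson(), offset=np.log(person_years))
result = model.fit()

# The exponentiated coefficient is an incidence rate ratio (about 1.4 here), not an odds ratio.
print(np.exp(result.params["hrt"]))
```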

I’m not sure how much those statistical issues matter. I would expect that you’d get broadly similar results with different techniques. But as with the opaque way in which the lifetime excess risk was calculated, it just bothers me when statistical methods are not as they should be. It makes you wonder if anything else was wrong with the analysis.

Oh, and a further oddity is that nowhere in the paper are we told the total sample size for the analysis. We are told the number of women who developed ovarian cancer, but we are not told the number of controls that were analysed. That’s a pretty basic piece of information that I would expect to see in any journal, never mind a top-tier journal such as the Lancet.

I don’t know whether those statistical oddities have a material impact on the analysis. Perhaps they do, perhaps they don’t. But ultimately, I’m not sure it’s the most important thing. The really important thing here is that the study has not shown that HRT causes an increase in ovarian cancer risk.

Remember folks, correlation does not equal causation.

Hospital special measures and regression to the mean

Forgive me for writing 2 posts in a row about regression to the mean. But it’s an important statistical concept, which also happens to be widely misunderstood. Sometimes with important consequences.

Last week, I blogged about a claim that student tuition fees had not put off disadvantaged applicants. The research was flawed, because it defined disadvantage on the basis of postcode areas, and not on the individual characteristics of applicants. This means that an increase in university applications from disadvantaged areas could have simply been due to regression to the mean (ie the most disadvantaged areas becoming less disadvantaged) rather than more disadvantaged individual students applying to university.

Today, we have a story in the news where exactly the same statistical phenomenon is occurring. The story is that putting hospitals into “special measures” has been effective in reducing their death rates, according to new research by Dr Foster.

The research shows no such thing, of course.

The full report, “Is [sic] special measures working?” is available here. I’m afraid the authors’ statistical expertise is no better than their grammar.

The research looked at 11 hospital trusts that had been put into special measures, and found that their mortality rates fell faster than hospitals on average. They thus concluded that special measures were effective in reducing mortality.

Wrong, wrong, wrong. The 11 hospital trusts had been put into special measures not at random, but precisely because they had higher than expected mortality. If you take 11 hospital trusts on the basis of a high mortality rate and then look at them again a couple of years later, you would expect the mortality rate to have fallen more than in other hospitals simply because of regression to the mean.

Maybe those 11 hospitals were particularly bad, but maybe they were just unlucky. Perhaps it’s a combination of both. But if they were unusually unlucky one year, you wouldn’t expect them to be as unlucky the next year. If you take the hospitals with the worst mortality, or indeed the most extreme examples of anything, you would expect it to improve just by chance.
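If you want to see that effect in action, here is a minimal simulation sketch (entirely made-up numbers, nothing to do with the real Dr Foster data). Every trust has exactly the same underlying mortality, and yet the "worst" 11 in one year reliably "improve" the next year:

```python
import numpy as np

rng = np.random.default_rng(0)

# 150 hypothetical trusts, all with the same true expected number of deaths;
# observed counts differ between years purely by chance.
n_trusts, expected_deaths = 150, 200
year1 = rng.poisson(expected_deaths, n_trusts)
year2 = rng.poisson(expected_deaths, n_trusts)   # nothing changed in between

# "Special measures": select the 11 trusts with the worst year 1 figures.
worst = np.argsort(year1)[-11:]

print("worst 11 trusts, year 1:", year1[worst].mean())   # well above 200
print("worst 11 trusts, year 2:", year2[worst].mean())   # back to around 200
print("all trusts, year 2:", year2.mean())
```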

This is a classic example of regression to the mean. The research provides no evidence whatsoever that special measures are doing anything. To do that, you would need to take poorly performing hospitals and allocate them at random either to have special measures or to be in a control group. Simply observing that the worst trusts got better after going into special measures tells you nothing about whether special measures were responsible for the improvement.

Student tuition fees and disadvantaged applicants

Those of you who have known me for a while will remember that I used to blog on the now defunct Dianthus Medical website. The Internet Archive has kept some of those blogposts for posterity, but sadly not all of them. As I promised when I started this blog, I will get round to putting all those posts back on the internet one of these days, but I’m afraid I haven’t got round to that just yet.

But in the meantime, I’m going to repost one of those blogposts here, as it has just become beautifully relevant again. About this time last year, UCAS (the body responsible for university admissions in the UK) published a report which claimed to show that applications to university from disadvantaged young people  were increasing proportionately more than applications from the more affluent, or in other words, the gap between rich and poor was narrowing.

Sadly, the report showed no such thing. The claim was based on a schoolboy error in statistics.

Anyway, UCAS have recently published their next annual report. Again, this claims to show that the gap between rich and poor is narrowing, but doesn’t. Again, we see the same inaccurate headlines in the media that naively take the report’s conclusions at face value, and we see exactly the same schoolboy error in the way the statistics were analysed in the report.

So as what I wrote last year is still completely relevant today, here goes…

One of the most significant political events of the current Parliament has been the huge increase in student tuition fees, which means that most university students now need to pay £9000 per year for their education.

One of the arguments against this rise used by its opponents was that it would put off young people from disadvantaged backgrounds from applying to university. Supporters of the new system argued that it would not, as students can borrow the money via a student loan to be paid back over a period of decades, so no-one would have to find the money up front.

The new fees came into effect in 2012, so we should now have some empirical data that should allow us to find out who was right. So what do the statistics show? Have people from disadvantaged backgrounds been deterred from applying to university?

A report was published earlier this year by UCAS, the organisation responsible for handling applications to university. This specifically addresses the question of applications from disadvantaged areas. This shows (see page 17 of the report) that although there was a small drop in application rates from the most disadvantaged areas immediately after the new fees came into effect, from 18.0% in 2011 to 17.5% in 2012, the rates have since risen to 20.5% in 2014. And the ratio of the rate of applications from the most advantaged areas to the most disadvantaged areas fell from 3.0 in 2011 to 2.5 in 2014.

So, case closed, then? Clearly the new fees have not stopped people from disadvantaged areas applying to university?

Actually, no. It’s really not that simple. You see, there is a big statistical problem with the data.

That problem is known as regression to the mean. This is a tendency of characteristics with particularly high or low values to become more like average values over time. It’s something we know all about in clinical trials, and is one of the reasons why clinical trials need to include control groups if they are going to give reliable data. For example, in a trial of a medication for high blood pressure, you would expect patients’ blood pressure to decrease during the trial no matter what you do to them, as they had to have high blood pressure at the start of the trial or they wouldn’t have been included in it in the first place.

In the case of the university admission statistics, the specific problem is the precise way in which “disadvantaged areas” and “advantaged areas” were defined.

The advantage or disadvantage of an area was defined by the proportion of young people participating in higher education during the period 2000 to 2004. Since the “disadvantaged” areas were specifically defined as those areas that had previously had the lowest participation rates, it is pretty much inevitable that those rates would increase, no matter what the underlying trends were.

Similarly, the most advantaged areas were almost certain to see decreases in participation rates (at least relatively speaking, though this is somewhat complicated by the fact that overall participation rates have increased since 2004).

So the finding that the ratio of applications from the most advantaged areas to those from the least advantaged areas has decreased is exactly what we would expect from regression to the mean. I’m afraid this does not provide evidence that the new tuition fee regime has been beneficial to people from disadvantaged backgrounds. It is very hard to disentangle any real changes in participation rates from different backgrounds from the effects of regression to the mean.

Unless anyone can point me to any better statistics on university applications from disadvantaged backgrounds, I think the question of whether the new tuition fee regime has helped or hindered social inequalities in higher education remains open.

The Saatchi Bill

I was disappointed to see yesterday that the Saatchi Bill (or Medical Innovations Bill, to give it its official name) passed its third reading in the House of Lords.

The Saatchi Bill, if passed, will be a dreadful piece of legislation. The arguments against it have been well rehearsed elsewhere, so I won’t go into them in detail here. But briefly, the bill sets out to solve a problem that doesn’t exist, and then offers solutions that wouldn’t solve it even if it did exist.

It is based on the premise that the main reason no progress is ever made in medical research (which is nonsense to start with, of course, because progress is made all the time) is that doctors are afraid to try innovative treatments in case they get sued. There is, however, absolutely no evidence that that’s true, and in any case, the bill would not help promote real innovation, as it specifically excludes the use of treatments as part of research. Without research, there is no meaningful innovation.

If the bill were simply ineffective, that would be one thing, but it’s also actively harmful. By removing the legal protection that patients  currently enjoy against doctors acting irresponsibly, the bill will be a quack’s charter. It would certainly make it more likely that someone like Stanislaw Burzynski, an out-and-out quack who makes his fortune from fleecing cancer patients by offering them ineffective and dangerous treatments, could operate legally in the UK. That would not be a good thing.

One thing that has struck me about the sorry story of the Saatchi bill is just how dishonest Maurice Saatchi and his team have been. A particularly dishonourable mention goes to the Daily Telegraph, who have been the bill’s "official media partner". Seriously? Since when did bills going through parliament have an official media partner? Some of the articles they have written have been breathtakingly dishonest. They wrote recently that the bill had "won over its critics", which is very far from the truth. Pretty much the entire medical profession is against it: this response from the Academy of Royal Medical Colleges is typical. The same article says that one way the bill had won over its critics was by amending it to require that doctors treating patients under this law must publish their research. There are 2 problems with that: first, the law doesn’t apply to research, and second, it doesn’t say anything about a requirement to publish results.

In an article in the Telegraph today, Saatchi himself continued the dishonesty. As well as continuing to pretend that the bill is now widely supported, he also claimed that more than 18,000 patients responded to the Department of Health’s consultation on the bill. In fact, the total number of responses to the consultation was only 170.

The dishonesty behind the promotion of the Saatchi bill has been well documented by David Hills (aka “the Wandering Teacake”), and I’d encourage you to read his detailed blogpost.

The question that I want to ask about all this is why? Why is Maurice Saatchi doing all this? What does he have to gain from promoting a bill that’s going to be bad for patients but good for unscrupulous quacks?

I cannot know the answers to any of those questions, of course. Only Saatchi himself can know, and even he may not really know: we are not always fully aware of our own motivations. The rest of us can only speculate. But nonetheless, I think it’s interesting to speculate, so I hope you’ll bear with me while I do so.

The original impetus for the Saatchi bill came when Saatchi lost his wife to ovarian cancer. Losing a loved one to cancer is always difficult, and ovarian cancer is a particularly nasty disease. There can be no doubt that Saatchi was genuinely distressed by the experience, and deserves our sympathy.

No doubt it seemed like a good idea to try to do something about this. After all, as a member of the House of Lords, he has the opportunity to propose new legislation. It is completely understandable that if he thought a new law could help people who were dying of cancer, he would be highly motivated to introduce one.

All of that is very plausible and easy to understand. What has happened subsequently, however, is a little harder to understand.

It can’t have been very long after Saatchi proposed the bill that many people who know more about medicine than he does told him why it simply wouldn’t work, and would have harmful consequences. So I think what is harder to understand is why he persisted with the bill after all the problems with it had been explained to him.

It has been suggested that this is about personal financial gain: his advertising company works for various pharmaceutical companies, and pharmaceutical companies will gain from the bill.

However, I don’t believe that that is a plausible explanation for Saatchi’s behaviour.

For a start, I’m pretty sure that the emotional impact of losing a beloved wife is a far stronger motivator than money, particularly for someone who is already extremely rich. It’s not as if Saatchi needs more money. He’s already rich enough to buy the support of a major national newspaper and to get a truly dreadful bill through parliament.

And for another thing, I’m not at all sure that pharmaceutical companies would do particularly well out of the bill anyway. They are mostly interested in getting their drugs licensed so that they can sell them in large quantities. Selling them as a one-off to individual patients is unlikely to be at the top of their list of priorities.

For what it’s worth, my guess is that Saatchi just has difficulty admitting that he was wrong. It’s not a particularly rare personality trait. He originally thought the bill would genuinely help cancer patients, and when told otherwise, he simply ignored that information. You might see this as an example of the Dunning-Kruger effect, and it’s certainly consistent with the widely accepted phenomenon of confirmation bias.

Granted, what we’re seeing here is a pretty extreme case of confirmation bias, and has required some spectacular dishonesty on the part of Saatchi to maintain the illusion that he was right all along. But Saatchi is a politician who originally made his money in advertising, and it would be hard to think of 2 more dishonest professions than politics and advertising. It perhaps shouldn’t be too surprising that dishonesty is something that comes naturally to him.

Whatever the reasons for Saatchi’s insistence on promoting the bill in the face of widespread opposition, this whole story has been a rather scary tale of how money and power can buy your way through the legislative process.

The bill still has to pass its third reading in the House of Commons before it becomes law. We can only hope that our elected MPs are smart enough to see what a travesty the bill is. If you want to write to your MP to ask them to vote against the bill, now would be a good time to do it.

Plain packaging for tobacco

Plain packaging for tobacco is in the news today. The idea behind it is that requiring tobacco manufacturers to sell cigarettes in unbranded packages, where all the branding has been replaced by prominent health warnings, will reduce the number of people who smoke, and thereby benefit public health.

But will it work?

That’s an interesting question. There’s a lot of research that’s been done, though it’s fair to say none of it is conclusive. For example, there has been research on how it affects young people’s perceptions of cigarettes and on what happened to the number of people looking for help with quitting smoking after plain packaging was introduced in Australia.

But for me, those are not the most interesting pieces of evidence.

What tells me that plain packaging is overwhelmingly likely to be an extremely effective public health measure is that the tobacco industry are strongly opposed to it. They probably know far more about the likely effects than the rest of us: after all, for me, it’s just a matter of idle curiosity, but for them, millions of pounds of their income depends on it. So the fact they are against it tells us plenty.

Let’s look in a little more detail at exactly what it tells us. Advertising and branding generally have 2 related but distinguishable aims for a company that sells something. One aim is to increase their share of the market, in other words to sell more of their stuff than their competitors in the same market. The other is to increase the overall size of the market, so that they sell more, and their competitors sell more as well. Both those things can be perfectly good reasons for a company to spend their money on advertising and branding.

But the difference between those 2 aims is crucial here.

If the point of cigarette branding were just to increase market share without affecting the overall size of the market, then the tobacco industry should be thoroughly in favour of a ban. Advertising and branding budgets, when the overall size of the market is constant, are a classic prisoner’s dilemma. If all tobacco companies spend money on branding, they will all have pretty much the same share as if no-one did, so they will gain nothing, but they will spend money on branding, so they’re worse off than if they didn’t. However, they can’t afford not to spend money on branding, as then they would lose market share to their competitors, who are still spending money on it.
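To make that concrete, here is a minimal sketch with purely hypothetical payoffs (the numbers are invented for illustration and are not estimates of real tobacco profits). Branding is the dominant strategy for each firm, yet both firms end up worse off than if neither could brand:

```python
# Payoffs are profits in arbitrary units for two firms in a fixed-size market.
payoffs = {
    # (firm A action, firm B action): (A's profit, B's profit)
    ("brand", "brand"):       (80, 80),    # same market shares, minus branding costs
    ("brand", "no brand"):    (110, 60),   # A takes share from B
    ("no brand", "brand"):    (60, 110),   # B takes share from A
    ("no brand", "no brand"): (100, 100),  # same shares, no branding costs
}

# Whatever B does, A earns more by branding (80 > 60 and 110 > 100), so branding
# is the dominant strategy for both firms -- yet both would be better off if
# neither could brand, which is exactly what a government-mandated ban enforces.
for (a, b), (pa, pb) in payoffs.items():
    print(f"A: {a:8s}  B: {b:8s}  ->  A gets {pa}, B gets {pb}")
```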

The ideal situation for the tobacco industry in that case would be that no-one would spend any money on branding. But how can you achieve that? For all the companies to agree not to spend money on branding might be an illegal cartel, and there’s always a risk that someone would break the agreement to increase their market share.

A government-mandated ban solves that problem nicely. If all your competitors are forced not to spend money on branding, then you don’t have to either. All the tobacco companies win.

So if that were really the situation, then you would expect the tobacco companies to be thoroughly in favour of it. But they’re not. So that tells me that we are not in the situation where the total market size is constant.

The tobacco companies must believe, and I’m going to assume here that they know what they’re doing, that cigarette branding affects the overall size of the market. If branding could increase the overall size of the market (or, more realistically given that smoking rates in the UK are on a long-term decline, stop it shrinking quite as fast), then it would be entirely rational for the tobacco companies to oppose mandatory plain packaging.

I don’t know about you, but that’s all the evidence I need to convince me that plain packaging is overwhelmingly likely to be an effective public health measure.

Volunteers wanted for research into homeopathy

I am planning a research project to explore the experiences of people who have used homeopathy, and if you have used homeopathy yourself then I would be really grateful if you would consider taking part in my research.

I would like to interview people who have used homeopathy, have been pleased with the results, but have encountered negative reactions to their use of homeopathy from others. Perhaps your GP has advised you not to use homeopathy, perhaps friends or family have told you that you were wasting your time, or perhaps you got into an argument with someone on the internet. It doesn’t matter who reacted negatively to your use of homeopathy: I am interested in learning about how users of homeopathy experience negative reactions in general.

The research will take the form of a short interview of about 30 minutes (which can take place in a location of your choice), in which I will ask you about your experiences and how they seemed to you. I shall be using a phenomenological approach to the research, which means that I am interested in learning about your experiences from your own point of view, rather than trying to fit them into a pre-existing theory.

This research is part of a degree in social sciences that I am doing with the Open University. Specifically, it is part of a module in social psychology.

In the interests of transparency, I should tell you that I am sceptical of the benefits of homeopathy. However, my intention in this research is not to challenge your views about homeopathy, it is to come to a better understanding of them.

If you decide you want to take part but later change your mind, that is fine. In that case, any materials from your interview would be returned to you or destroyed, as you prefer, and your interview would not be used in my research.

Please be assured that your participation in the study would be kept strictly confidential.

If you are interested in taking part in my research or if you would just like to know more about the project, please feel free to contact me.

Thank you for taking the time to read this.

Adam Jacobs

Update 29 January 2015:

Many thanks to everyone who volunteered for this project. I now have enough data, and so I no longer need any more volunteers.

Are two thirds of cancers really due to bad luck?

A paper published in Science has been widely reported in the media today. According to media reports, such as this one, the paper showed that two thirds of cancers are simply due to bad luck, and only one third are due to environmental, lifestyle, or genetic risk factors.

The paper shows no such thing, of course.

It’s actually quite an interesting paper, and I’d encourage you to read it in full (though sadly it’s paywalled, so you may or may not be able to). But it did not show that two thirds of cancers are due to bad luck.

What the authors did was to look at the published literature on 31 different types of cancer (eg lung cancer, thyroid cancer, colorectal cancer, etc) and estimate 2 quantities for each type of cancer: the lifetime risk of getting the cancer, and how often stem cells divide in those tissues.

They found a very strong correlation between those two quantities: tissues in which stem cells divided frequently (eg the colon) were more likely to develop cancer than tissues in which stem cell division was less frequent (eg the brain).

The correlation was so strong, in fact, that it explained two thirds of the variation among different tissue types in their cancer incidence. The authors argue that because mutations that can lead to cancer can occur during stem cell division purely by chance, that means that two thirds of the variation in cancer risk is due to bad luck.
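For anyone wondering how a correlation becomes "two thirds of the variation": the proportion of variance explained is the square of the correlation coefficient. A minimal check, assuming a correlation of roughly 0.8, which is about the size of the figure reported in the paper:

```python
# Proportion of variance explained = correlation coefficient squared.
r = 0.8          # roughly the reported correlation (assumption for illustration)
print(r ** 2)    # 0.64, i.e. roughly two thirds of the variation
```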

So, that explains where the “two thirds” figure comes from.

The problem is that it applies only to explaining the variation in cancer risk from one tissue to another. It tells us nothing about how much of the risk within a given tissue is due to modifiable factors. You could potentially see exactly the same results whether each specific type of cancer struck completely at random or whether each specific type were hugely influenced by environmental risk factors.

Let’s take lung cancer as an example. Smoking is a massively important risk factor. Here’s a study that estimated that over half of all lung cancer deaths in Japanese males were due to smoking. Or to take cervical cancer as another example, about 70% of cervical cancers are due to just 2 strains of HPV.

Those are important statistics when considering what proportion of cancers are just bad luck and what proportion are due to modifiable risk factors, but they did not figure anywhere in the latest analysis.

So in fact, interesting though this paper is, it tells us absolutely nothing about what proportion of cancer cases are due to modifiable risk factors.

We often see medical research badly reported in the newspapers. Often it doesn’t matter very much. But here, I think real harm could be done. The message that comes across from the media is that cancer is just a matter of luck, so changing your lifestyle won’t make much difference anyway.

We know that lifestyle is hugely important not only for cancer, but for many other diseases as well. For the media to give the impression that lifestyle isn’t important, based on a misunderstanding of what the research shows, is highly irresponsible.

Edit 5 Jan 2015:

Small correction made to the last paragraph following discussion in the comments below: "claim" changed to "give the impression".

Detox: it’s all a con

At this time of year you will no doubt see many adverts for “detox” products. It’s a nice idea. Most of us have probably eaten and drunk rather more than we should have done over the last week or so. Wouldn’t it be nice if we could buy some nice helpful thing that would “flush all the toxins out of our system”?

This one is pretty typical. It claims to “cleanse the body from inside out”. There’s just one problem with this claim, and indeed with the claims of any other detox product you care to mention: it’s total bollocks.

Let me explain with this handy diagram:

[Diagram: Detox]

There may well be things in our system that would be better off not in our system. Alcohol immediately springs to mind. But here’s the thing: millions of years of evolution have given us a liver and a pair of kidneys which, between them, do a remarkably good job of ridding the body of anything that shouldn’t be in it.

There is no scientific evidence whatsoever that any “detox” product will provide even the slightest improvement on your liver and kidneys.

If someone tries to sell you a detox product, perhaps you could ask which specific toxins it helps to remove. I have never seen that specified, but surely that is the first step to being able to show whether it works or not.

And then, in the unlikely event that this snake-oil detox salesman does tell you which toxin(s) the product is supposed to remove, ask for the evidence that it does. I guarantee you that you will not get a sensible answer.

So if your new year’s resolution is to “detox” yourself, then that’s great. Eat a healthy balanced diet, don’t drink too much alcohol, take plenty of exercise, and don’t smoke. But any money you spend on “detox” products will be 100% wasted.

Happy new year.

Does peer review fail to spot outstanding research?

A paper by Siler et al was published last week which attracted quite a bit of attention among those of us who take an interest in scientific publishing and the peer review process. It looked at the citation count of papers that had been submitted to 3 high-impact medical journals and subsequently published, either in one of those 3 journals or in another journal if rejected by one of the 3.

The accompanying press release from the publisher told us that “scientific peer review may have difficulties identifying unconventional and/or outstanding work”. This wasn’t too far off what was claimed in the paper, where Siler et al concluded that their work suggested that peer review “had difficulties in identifying outstanding or breakthrough work”.

The press release was reported uncritically by several organisations that should have known better, including Science, Nature,  and Retraction Watch.

It’s an interesting theory. The theory goes that peer reviewers don’t like to get out of their comfort zone, and while they may give good reviews to small incremental advances in their field, they don’t like radical new research that breaks new ground, so such research may be rejected.

The only problem with this theory is that Siler et al’s paper provides absolutely no data to support it.

Let’s look at what they did. They looked at 1008 manuscripts that were submitted to 3 top-tier medical journals (Annals of Internal Medicine, British Medical Journal, and The Lancet). Most of those papers were rejected, but subsequently published in other journals. Siler et al tracked the papers to see how many times each paper was cited.

Now, there we have our first problem. Using the number of times a paper is cited as a measure of groundbreaking research is pretty crude. Papers can be highly cited for many reasons, and presenting groundbreaking research is only one of them. I am writing this blogpost on the same day that I found that the 6th most important paper of the year according to "Altmetrics" (think of it as citation counting for the Facebook generation) was about how long it takes for boxes of chocolates on hospital wards to be eaten. A nicely conducted and amusing piece of research, to be sure, but hardly breaking new frontiers in science.

There’s also something rather fishy about the numbers of citations reported in the paper. The group of papers with the lowest citation rate reported in the paper were cited an average of 69.8 times each. That’s an extraordinarily high number. Of the 3 top-tier journals studied, The Lancet has the highest impact factor, at 39.2. That means that papers in The Lancet are cited an average of 39.2 times each. Doesn’t it seem rather odd that papers rejected from it are cited almost twice as often? I’m not sure what to make of that, but it does make me wonder if there is a problem with data quality.

Anyway, the main piece of evidence used to support the idea that peer review was bad at recognising outstanding research is that the 14 most highly cited papers of the 1008 papers examined were rejected by the 3 top journals. The first problem with that is that 12 of those 14 were rejected by the journals’ in-house editorial staff without being sent for peer review. So even if there were no further problems with the paper, we couldn’t draw any conclusions about failings of peer review: the failings would be down to journals’ in-house staff.

Another problem is that those 14 papers were not, of course, rejected by the peer review system. They were all published in peer reviewed journals: just not the first journal that the authors tried. So we really can’t conclude that peer review is preventing groundbreaking work from being published.

But in any case, even if we ignore those flaws, is it still not true that groundbreaking (or at least highly cited) research is being rejected? To make that claim, I think we’d want to know that highly cited research is more likely to be rejected than other research.

And I’m afraid the evidence for that is totally lacking.

Rejecting the top 14 papers sounds bad. But it’s important to realise that the overall rejection rate was very high: only 6.2% of the papers submitted were accepted. If the probability of accepting each of the top 14 papers was 6.2%, like all the others, then there is about a 40% chance that all 14 of them would be rejected. And that is ignoring the fact that looking specifically at the top 14 papers is a post-hoc analysis. The only robust way to see whether the more highly cited papers were more likely to be rejected would have been to specify the hypothesis in advance, rather than to focus on whatever came out of the data as the most impressive statistic.
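That 40% figure is easy to check. A minimal sketch, assuming each paper’s fate is independent and each has the same 6.2% chance of acceptance:

```python
# Probability that all 14 of the most highly cited papers are rejected,
# if each independently had the same 6.2% chance of acceptance as any other paper.
p_accept = 0.062
print((1 - p_accept) ** 14)   # about 0.41
```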

So, to recap, this paper used a crude measure of whether papers were groundbreaking, did not look at what peer reviewers thought of them, found precisely zero high impact articles that were rejected by the peer review system, and found no evidence whatsoever that high-impact articles were more likely to be rejected than any others.

Call me a cynic if you like, but I’m not convinced. The peer review process is not perfect, of course. But if you want to convince me that one of its flaws is that it is biased against groundbreaking research, you’re going to have to come up with better evidence than Siler et al’s paper.

 

Clinically proven

My eye was caught the other day by this advert:

[Image: Boots advert]

Quite a bold claim, I thought. “Defends against cold and flu” would indeed be impressive, if it were true. Though I also noticed the somewhat meaningless verb “defend”. What does that mean exactly? Does it stop you getting a cold or flu in the first place? Or does it just help you recover faster if you get a cold or flu?

I had a look at the relevant page on the Boots website to see if I could find out more. It told me

“Boots Pharmaceuticals Cold & Flu Defence Nasal Spray is an easy to use nasal spray with antiviral properties containing clinically proven Carragelose to defend against colds and flu, as well as help shorten the duration and severity of both colds and flu.”

It then went on to say

“Use three times a day to help prevent a cold or flu, or several times a day at the first signs helping reduce the severity and duration of both colds and flu.”

OK, so Boots obviously want us to think that it can do both: prevent colds and flu and help treat them.

So what is the evidence? Neither the advert nor the web page had any links to any of the evidence backing up the claim that these properties were “clinically proven”. So I tweeted to Boots to ask them.

To their credit, Boots did reply to me (oddly by direct message, in case you’re wondering why I’m not linking to their tweets) with 4 papers in peer reviewed journals.

So how does the evidence stack up?

Well, the first thing to note is that although there were 4 papers, there were only 3 clinical trials: one of the papers is a combined analysis of 2 of the others. The next thing to note is that all 3 trials were of patients in the early stages of a common cold. So right away we can see that we have no evidence whatsoever that the product can help prevent a cold or flu, and no evidence whatsoever that it can treat flu.

The "clinically proven" claim is starting to look a little shaky.

But can it at least treat a common cold? That would be pretty impressive if it could. The common cold has proved remarkably resilient to anything medical science can throw at it. A treatment that actually worked against the common cold would indeed be good news.

The first of the trials was published in 2010. It was an exploratory study in 35 patients who were in the first 48 hours of a cold, but otherwise healthy. It was randomised and double-blind, and as far as I can tell from the paper, seems to have been reasonably carefully conducted. The study showed a significant benefit of the nasal spray on the primary outcome measure, namely the average of a total symptom score on days 2 to 4 after the start of dosing.

Well, I say significant. It met the conventional level of statistical significance, but only just, at P = 0.046 (that means that there’s about a 1 in 20 chance you could have seen results like this if the product were in fact completely ineffective: not a particularly high bar). The size of the effect also wasn’t very impressive: the symptom score was 4.6 out of a possible 24 in the active treatment group and 6.3 in the placebo group. Not only that, but it seems symptom scores were higher in the placebo group at baseline as well, and no attempt was made to adjust for that.

So not wholly convincing, really. On the other hand, the study did show quite an impressive effect on the secondary outcome of viral load, with a 6-fold increase from baseline to day 3 or 4 in the placebo group, but a 92% decrease in the active group. This was statistically significant at P = 0.009.

So we have some preliminary evidence of efficacy, but with such a small study and such unconvincing results on the primary outcome of symptoms, I think we’re going to have to do a lot better.

The next study was published in 2012, and included children (ages 1 to 18 years) in the early stages of a common cold. It was also randomised and double blind. The study randomised 213 patients, but only reported efficacy data for 153 of them, so that’s not a good start. It also completely failed to show any difference between the active and placebo treatments on the primary outcome measure, the symptom score from days 2 to 7. Again, there was a significant effect on viral load, but given the lack of an effect on the symptom score, it’s probably fair to say the product doesn’t work very well, if at all, in children.

The final study was published in 2013. It was again randomised and double blind, and like the first study included otherwise healthy adults in the first 48 h of a common cold. The primary endpoint was different this time, and was the duration of disease. This was a larger study than the first one, and included 211 patients.

The results were far from impressive. One of the big problems with this study was that they restricted their efficacy analysis to the subset of 118 patients with laboratory confirmed viral infection. Losing half your patients from the analysis like this is a huge problem. If you have a cold and are tempted to buy this product, you won’t know whether you have laboratory confirmed viral infection, so the results of this study may not apply to you.

But even then, the results were distinctly underwhelming. The active and placebo treatments were only significantly different in the virus-positive per-protocol population, a set of just 103 patients: less than half the total number recruited. And even then, the results were only just statistically significant, at P = 0.037. The duration of disease was reduced from 13.7 days in the placebo group to 11.6 days in the active group.

So, do I think that Boots Cold and Flu Defence is “clinically proven”? Absolutely not. There is no evidence whatsoever that it prevents a cold. There is no evidence whatsoever that it either prevents or treats flu.

There is some evidence that it may help treat a cold. It’s really hard to know whether it does or not from the studies that have been done so far. Larger studies will be needed to confirm or refute the claims. If it does help to treat a cold, it probably doesn’t help very much.

The moral of this story is that if you see the words “clinically proven” in an advert, please be aware that that phrase is completely meaningless.