Monday, 31 August 2015

Doctoring Data: Review of a must-read book

Book Review:

Review by Norman Fenton and Martin Neil (a pdf version of this article can be found here)

This is an extremely important (and also entertaining) book that should be mandatory reading not just for anybody interested in finding out about what data-driven medical studies really mean, but also for anybody engaged in any kind of empirical work. What Kendrick shows brilliantly is the extent to which the vast majority of medical recommendations and guidelines are based on data-driven studies that are fundamentally flawed and often corrupt. He highlights how the resulting recommendations and guidelines have led (world-wide) to millions of unnecessarily early deaths, millions of people suffering unnecessary pain, and widespread use of drugs and treatments that do more harm than good (example: statins), as well as wasting billions of taxpayer dollars every year.

As researchers who have been involved in empirical studies in a very wide range of disciplines over many years we believe that much of what he says is also relevant to all of these disciplines (which include most branches of the physical, natural and environmental sciences, computer science, the social sciences, and law). Apart from the cases of deliberate corruption and bias (of which Kendrick provides many medical examples) most of the flaws boil down to a basic misunderstanding of statistics, probability and the scientific method.

There are two notable quotes that Kendrick uses, which we believe sum up most of the problems he identifies:
  1. “When a man finds a conclusion agreeable, he accepts it without argument, but when he finds it disagreeable, he will bring against it all the forces of logic and reason.” Thucydides
  2.  “I know that most men, including those at ease with problems of the greatest complexity, can seldom accept even the simplest and most obvious truth if it be such as would oblige them to admit the falsity of conclusions which they have delighted in explaining to colleagues, which they have proudly taught to others, and which they have woven, thread by thread, into the fabric of their lives.” Leo Tolstoy
The first sums up the extent to which results of empirical work are doctored to suit the pre-conceived biases and hopes of those undertaking it (a phenomenon also known as ‘confirmation bias’). The second sums up the extent to which there are ideas that represent the ‘accepted orthodoxy’ in most disciplines that are impossible to challenge even when they are wrong. Those brave enough to challenge the accepted orthodoxy risk ruining their careers in their discipline. Hence, most researchers and practitioners simply accept the orthodoxy without question and help perpetuate flawed or useless ideas in order to get funding and progress their careers. Kendrick describes how these problems lie at the heart of the fundamentally fraudulent peer review system in medicine – which applies to both submitting articles to journals and submitting research grant applications. Once again, we believe that all of the areas of research where we have worked (maths, computer science, forensics, law, and AI) suffer from the same flawed peer review system.

Kendrick is not afraid to challenge the leading figures in medicine, often exposing examples of hypocrisy and corruption. Of special interest to us, however, is that he also challenges the attitude of revered figures in our own discipline. For example, Kendrick highlights two quotes in a recent article by Nobel prize-winner Daniel Kahneman, whose work in the psychology of decision theory and risk is held in the highest esteem:
  1.  “The way scientists try to convince people is hopeless because they present evidence, figures, tables, arguments, and so on. But that’s not how to convince people. People aren’t convinced by arguments, they don’t believe conclusions because they believe in the arguments that they read in favour of them. They’re convinced because they read or hear the conclusions from people they trust. You trust someone and you believe what they say. That’s how ideas are communicated. The arguments come later.”
  2.  “Why do I believe global warming is happening? The answer isn’t that I have gone through all the arguments and analysed the evidence – because I haven’t. I believe the experts from the Academy of Sciences. We all have to rely on experts.”
Kendrick notes the problem here:
“In one breath he states that people aren’t convinced by arguments; they’re convinced because they read or hear conclusions from people they trust. Then he says that we all have to rely on experts. But he does not link these two thoughts together to ask the obvious question. Just how, exactly, did the experts come to their conclusions?”
Having presented the BBC documentary on Climate Change by Numbers we also got an insight into the extent to which problems exist there.

As good as the book is (and indeed because of how good it is), we feel the need to highlight some points where we believe Kendrick gets it wrong. There are some statistical/probability errors and over-simplifications, which mostly seem to stem from a lack of awareness of Bayesian probability. For example, he says:
“… although association cannot prove causation, a lack of association does disprove causation”.
This is not true, as can be proven by the simple counter-example we provide below using a Bayesian network*.

Next, we believe Kendrick’s faith in randomised controlled trials (RCTs) as being the (only) reliable empirical basis for medical decision making is misplaced. Because of Simpson’s paradox and the impossibility of accounting for all confounding variables there is, in principle, no solid basis for believing that the result of any RCT is ‘correct’. As is shown in the article here it is possible, for example, that an RCT can find a drug to be effective compared to a placebo in every possible categorisation of trial participants, yet the addition of a single confounding variable can result in an exact reversal of the results.
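To make this kind of reversal concrete, here is a minimal Python sketch. The counts are invented purely for illustration (they are not taken from Kendrick’s book, from the article referred to above, or from any real trial), and the names `trial`, `pooled_rate` and the ‘mild’/‘severe’ split are assumptions of the sketch. Pooled over all participants the drug appears better than the placebo, yet once one extra variable is used to split the participants, the placebo does better in both subgroups.

```python
from fractions import Fraction

# Hypothetical counts (recovered, total) for each arm, split by one
# additional variable that was not part of the original analysis.
trial = {
    "mild":   {"drug": (64, 80), "placebo": (18, 20)},
    "severe": {"drug": (4, 20),  "placebo": (24, 80)},
}
ARMS = ("drug", "placebo")


def pooled_rate(arm):
    """Recovery rate for one arm, ignoring the extra variable."""
    recovered = sum(trial[stratum][arm][0] for stratum in trial)
    total = sum(trial[stratum][arm][1] for stratum in trial)
    return Fraction(recovered, total)


# Rates with the extra variable ignored, then with it taken into account.
pooled = {arm: pooled_rate(arm) for arm in ARMS}
by_stratum = {
    stratum: {arm: Fraction(*trial[stratum][arm]) for arm in ARMS}
    for stratum in trial
}

print("pooled:", {arm: float(r) for arm, r in pooled.items()})
for stratum, rates in by_stratum.items():
    print(stratum + ":", {arm: float(r) for arm, r in rates.items()})
```

In these made-up numbers the reversal arises because the drug arm consists mostly of ‘mild’ cases while the placebo arm consists mostly of ‘severe’ ones, so the pooled comparison mixes the effect of the drug with the effect of the extra variable.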

So, if we are saying that even RCTs cannot be accepted as valid empirical evidence, does that mean that we are even more pessimistic than Kendrick about the possibility of any useful empirical research? No - and this brings us to our final major area of disagreement with Kendrick’s thesis. In contrast to what Kendrick proposes we believe there is an important role for expert judgment in critical decision-making. In fact, we believe expert judgment is inevitable even if every attempt is made to remove it from an empirical study (it is, for example, impossible to remove expert judgment from the very problem of framing the study and choosing the variables and data to collect). Given the inevitability of expert judgment, we feel it should be made obvious, transparent, and open to refutation by experiment. Any scientist should be as open and honest about their judgment as possible and be prepared to make predictions and be contradicted by data.

By combining expert judgment with data it is possible to get far more reliable empirical results with much less data and effort than required for an RCT. This is essentially what we proposed in our book and which is being further developed in the EU project BayesKnowledge.

*Refuting the assertion “If there is no association (correlation) then there cannot be causation”.

Consider the two hypotheses:
  • H1: “If there is no association (correlation) then there cannot be causation”.
  • H2: “If there is causation there must be association (correlation)”.
Kendrick’s assertion (H1) is, of course, equivalent to H2. We can disprove H2 with a simple counter-example using two Boolean variables a and b, i.e. variables whose states are True or False. We do this by introducing a third, latent, unobserved Boolean variable c. Specifically we define the relationship between a, b, and c via the following Bayesian network:

By definition b is completely causally dependent on a. This is because, when c is True the state of b will be the same as the state of a, and when c is False the state of b will be the opposite of the state of a.

However, suppose, as in many real-world situations, that c is both hidden and unobserved (i.e. a typical confounding variable). Also, assume that the priors for the variables a and c are uniform (i.e. 50% of the time they are False and 50% of the time they are True).

Then when a is False there is a 50% chance b is False and a 50% chance b is True. Similarly, when a is True there is a 50% chance b is False and a 50% chance b is True. In other words, what we actually observe is zero association (correlation) despite the underlying mechanism being completely (causally) deterministic.
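The counter-example can also be checked by brute-force enumeration. The Python sketch below assumes the structure described in the text: a and c are independent with uniform priors, and b equals a when c is True and the opposite of a when c is False. The function and variable names are illustrative only and are not part of the AgenaRisk model.

```python
from itertools import product

# Uniform priors on the observed cause a and the hidden variable c.
P_A = {True: 0.5, False: 0.5}
P_C = {True: 0.5, False: 0.5}


def b_of(a, c):
    """b is fully determined by its parents: same as a if c, else the opposite."""
    return a if c else not a


def joint_ab():
    """Joint distribution P(a, b), obtained by summing out the hidden c."""
    joint = {}
    for a, c in product((True, False), repeat=2):
        key = (a, b_of(a, c))
        joint[key] = joint.get(key, 0.0) + P_A[a] * P_C[c]
    return joint


def p_b_true_given_a(a):
    """P(b = True | a), computed from the joint distribution."""
    joint = joint_ab()
    p_a = joint.get((a, True), 0.0) + joint.get((a, False), 0.0)
    return joint.get((a, True), 0.0) / p_a


def covariance_ab():
    """Cov(a, b), treating True as 1 and False as 0."""
    joint = joint_ab()
    e_a = sum(p for (a, _), p in joint.items() if a)
    e_b = sum(p for (_, b), p in joint.items() if b)
    e_ab = joint.get((True, True), 0.0)
    return e_ab - e_a * e_b


print("P(b=True | a=True)  =", p_b_true_given_a(True))
print("P(b=True | a=False) =", p_b_true_given_a(False))
print("Cov(a, b)           =", covariance_ab())
```

Both conditional probabilities come out at 0.5 and the covariance is zero, even though `b_of` is a deterministic function of a once c is known.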

The above BN model can be downloaded here and run using the free version of AgenaRisk.

Sunday, 30 August 2015

Using Bayesian networks to assess and manage risk of violent reoffending among prisoners

Fragment of BN model
Probation officers, clinicians, and forensic medical practitioners have for several years sought improved decision support for determining whether and when to release prisoners with mental health problems and a history of violence.  It is critical that the risk of violent re-offending is accurately measured and, more importantly, well managed with causal interventions to reduce this risk after release. The well-established 'risk predictors' in this area of research are typically based on statistical regression models and their results are less than convincing. But recent work undertaken at Queen Mary University of London has resulted in Bayesian network (BN) models that not only have much greater accuracy, but which are also much more useful for decision support. The work has been developed as part of a collaboration between the Risk and Information Management group and the medical practitioners of the Violence Prevention Research Unit (VPRU) of the Wolfson Institute of Preventative Medicine.

The BN model, called DSVM-P (Decision Support for Violence Management – Prisoners), captures the causal relationships between risk factors, interventions and violence. It also allows for specific risk factors to be targeted for causal intervention for risk management of future re-offending. These decision support features are not available in the previous generation of models used by practitioners and forensic psychiatrists.

Full reference:
Constantinou, A., Freestone, M., Marsh, W., Fenton, N. E., Coid, J. (2015) "Risk assessment and risk management of violent reoffending among prisoners", Expert Systems With Applications 42 (21), 7511-7529.