## Tuesday, 8 November 2016

### Confusion over the Likelihood Ratio

7 Jan 2017: There is an update to this post here.

The 'Likelihood Ratio' (LR) has been dominating discussions at the third workshop in our Isaac Newton Institute Cambridge Programme, Probability and Statistics in Forensic Science.
There have been many fine talks on the subject - and these talks will be available here for those not fortunate enough to be attending.

We have written before (see links at bottom) about some concerns with the use of the LR. For example, we feel there is often a desire to produce a single LR even when there are multiple different unknown hypotheses and dependent pieces of evidence (in such cases we feel the problem needs to be modelled as a Bayesian network) - see [1]. Based on the extensive discussions this week, I think it is worth recapping another of these concerns (namely, what happens when the hypotheses are non-exhaustive).

To recap: The LR  is a formula/method that is recommended for use by forensic scientists when presenting evidence - such as the fact that DNA collected at a crime scene is found to have a profile that matches the DNA profile of a defendant in a case. In general, the LR is a very good and simple method for communicating the impact of evidence (in this case on the hypothesis that the defendant is the source of the DNA found at the crime scene).

To compute the LR, the forensic expert is forced to consider the probability of finding the evidence under both the prosecution and defence hypotheses. So, if the prosecution hypothesis Hp is "Defendant is the source of the DNA found" and the defence hypothesis Hd is "Defendant is not the source of the DNA found" then we compute both the probability of the evidence given Hp - written P(E | Hp) - and the probability of the evidence given Hd - written P(E | Hd). The LR is simply the ratio of these two likelihoods, i.e. P(E | Hp) divided by P(E | Hd).

The very act of considering both likelihood values is a good thing to do because it helps to avoid common errors of communication that can mislead lawyers and juries (notably the prosecutor's fallacy). But, most importantly, the LR is a measure of the probative value of the evidence. However, this notion of probative value is where misunderstandings and confusion sometimes arise. In the case where the defence hypothesis is the negation of the prosecution hypothesis (i.e. Hd is the same as "not Hp" as in our example above) things are clear and very powerful because, by Bayes theorem:
• when the LR is greater than one the evidence supports the prosecution hypothesis (increasingly for larger values) - in fact the posterior odds of the prosecution hypothesis increase by a factor of LR over the prior odds.
• when the LR is less than one it supports the defence hypothesis (increasingly as the LR gets closer to zero) - the posterior odds of the defence hypothesis increase by a factor of 1/LR over the prior odds.
• when the LR is equal to one then the evidence supports neither hypothesis and so is 'neutral' - the posterior odds of both hypotheses are unchanged from their prior odds. In such cases, since the evidence has no probative value lawyers and forensic experts believe it should not be admissible.
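
The three cases above follow from the odds form of Bayes' theorem: posterior odds of Hp = LR × prior odds of Hp. The following is a minimal Python sketch of that relationship, assuming Hd is exactly "not Hp". The function names and the prior of 1 in 1000 are illustrative choices only and do not come from any real case.

```python
from fractions import Fraction

def posterior_odds(prior_prob_hp, lr):
    """Posterior odds of Hp against Hd = 'not Hp', given the prior P(Hp) and the LR."""
    prior_odds = prior_prob_hp / (1 - prior_prob_hp)
    return prior_odds * lr

def odds_to_prob(odds):
    """Convert odds in favour of a hypothesis into a probability."""
    return odds / (1 + odds)

prior = Fraction(1, 1000)  # hypothetical prior probability of Hp

# LR > 1 raises the odds of Hp, LR = 1 leaves them unchanged, LR < 1 lowers them.
for lr in (Fraction(100), Fraction(1), Fraction(1, 100)):
    odds = posterior_odds(prior, lr)
    print(f"LR = {float(lr):g}: posterior P(Hp) = {float(odds_to_prob(odds)):.6f}")
```

Because the hypotheses are exhaustive here, the odds of Hd are simply the reciprocal of the odds of Hp, which is why an LR below one multiplies the defence odds by 1/LR.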
However, things are by no means as clear and powerful when the hypotheses are not exhaustive (i.e. not the negation of each other) and in most forensic applications this is the case. For example, in the case of DNA evidence, while the prosecution hypothesis Hp is still "defendant is source of the DNA found" in practice the defence hypothesis Hd is often something like "a person unrelated to the defendant is the source of the DNA found".

In such circumstances the LR can only help us to distinguish between which of the two hypotheses is more likely, so, e.g.  when the LR is greater than one the evidence supports the prosecution hypothesis over the defence hypothesis (with larger values leading to increased support). However, unlike the case for exhaustive hypotheses, the LR tells us nothing about the change in odds of the prosecution hypothesis. In fact, it is quite possible that the LR can be very large - i.e. strongly supporting the prosecution hypothesis over the defence hypothesis - even though the posterior probability of the prosecution hypothesis goes down.  This rather worrying point is not understood by all forensic scientists (or indeed by all statisticians). Consider the following example (it's a made-up coin tossing example, but has the advantage that the numbers are indisputable):
Fred claims to be able to toss a fair coin in such a way that about 90% of the time it comes up Heads. So the main hypothesis is
H1: Fred has genuine skill
To test the hypothesis, we observe him toss a coin 10 times. It comes out Heads each time. So our evidence E is 10 out of 10 Heads. Our alternative hypothesis is:
H2: Fred is just lucky.

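Under binomial assumptions (independent tosses, with the probability of Heads being 0.9 under H1 and 0.5 under H2), P(E | H1) = 0.9^10 is about 0.35 while P(E | H2) = 0.5^10 is about 0.001. So the LR is about 350, strongly in favour of H1.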
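
These two likelihoods can be reproduced with a few lines of Python. This is only a sketch under the assumptions that the tosses are independent, that H1 means the probability of Heads is exactly 0.9, and that H2 means it is exactly 0.5; the function name `binomial_likelihood` is made up for this illustration.

```python
from math import comb

def binomial_likelihood(heads, tosses, p_heads):
    """Probability of exactly `heads` Heads in `tosses` independent tosses."""
    return comb(tosses, heads) * p_heads**heads * (1 - p_heads)**(tosses - heads)

p_e_h1 = binomial_likelihood(10, 10, 0.9)  # Fred has genuine skill: 0.9**10
p_e_h2 = binomial_likelihood(10, 10, 0.5)  # Fred is just lucky: 0.5**10
lr = p_e_h1 / p_e_h2

print(f"P(E | H1) = {p_e_h1:.4f}")
print(f"P(E | H2) = {p_e_h2:.6f}")
print(f"LR = {lr:.1f}")
```

Without rounding the likelihoods first, the LR comes out at roughly 357, which is consistent with the "about 350" quoted above.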

However, the problem here is that H1 and H2 are not exhaustive. There could be another hypothesis H3: "Fred is cheating by using a double-headed coin". Now, P(E | H3) = 1.

If we assume that H1, H2 and H3 are the only possible hypotheses* (i.e. they are exhaustive) and that they are equally likely a priori, i.e. each has a prior probability of 1/3, then the posteriors after observing the evidence E are:

H1: 0.25907     H2: 0.00074        H3: 0.74019

So, after observing the evidence E, the posterior for H1 has actually decreased (from its prior of 1/3 to about 0.26) despite the very large LR in its favour over H2.
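
The posteriors can be checked directly with Bayes' theorem over the three hypotheses. The sketch below assumes, as in the text, that H1, H2 and H3 are exhaustive with equal priors of 1/3; the function name `posteriors` is illustrative. It runs the calculation twice: once with the rounded likelihoods (0.35 and 0.001), which gives the figures quoted above, and once with the unrounded values 0.9^10 and 0.5^10.

```python
def posteriors(priors, likelihoods):
    """Posterior probability of each hypothesis, assuming the hypotheses are exhaustive."""
    joint = [prior * like for prior, like in zip(priors, likelihoods)]
    total = sum(joint)  # P(E), by the law of total probability
    return [j / total for j in joint]

priors = [1/3, 1/3, 1/3]  # H1 (skill), H2 (luck), H3 (double-headed coin)

rounded = posteriors(priors, [0.35, 0.001, 1.0])      # likelihoods as rounded in the text
exact = posteriors(priors, [0.9**10, 0.5**10, 1.0])   # unrounded likelihoods

for label, post in (("rounded", rounded), ("exact", exact)):
    print(label, " ".join(f"H{i}: {p:.5f}" for i, p in enumerate(post, start=1)))
```

With the unrounded likelihoods the posterior for H1 is about 0.258 rather than 0.259, so the conclusion is unchanged: H1 drops below its prior of 1/3 even though the LR of H1 over H2 is in the hundreds.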
In the above example, a good forensic scientist - if considering only H1 and H2 - would conclude by saying something like
"The evidence is about 350 times more likely under hypothesis H1 than under H2, but tells us nothing about whether we should have greater belief in H1 being true; indeed, it is possible that the evidence may much more strongly support some other hypothesis not considered and even make our belief in H1 decrease".
However, in practice (and I can confirm this from having read numerous DNA and other forensic case reports) no such careful statement is made. In fact, the most common assertion used in such circumstances is:
"The evidence provides strong support for hypothesis H1"
Such an assertion is not only mathematically wrong but highly misleading. Consider, as discussed above, a DNA case where:

Hp is "defendant is source of the DNA found"
Hd is  "a person unrelated to the defendant is the source of the DNA found".

This particular Hd hypothesis is a common convenient choice for the simple reason that P(E | Hd) is relatively easy to compute (it is the 'random match probability'). For single-source, high quality DNA this probability can be extremely small - of the order of one over several billions; since P(E | Hp) is equal to 1 in this case the LR is several billions. But, this does NOT provide overwhelming support for Hp as is often assumed unless we have been able to rule out all relatives of the defendant as suspects. Indeed, for less than perfect DNA samples it is quite possible for the LR  to be in the order of millions but for a close relative to be a more likely source than the defendant.

While confusion and misunderstandings can and do occur as a result of using hypotheses that are not exhaustive, there are many real examples where the choice of such non-exhaustive hypotheses is actually negligent.  The following worrying example is based on a real case (location details changed as an appeal is ongoing):
The suspect is accused of committing a crime in a particular rural location A near his home village in Dorset. The evidence E is soil found on the suspect's car. The prosecution hypothesis Hp is "the soil comes from A". The suspect lives (and drives) near this location but claims he did not drive to that specific spot. To 'test' the prosecution hypothesis a soil expert compares Hp with the hypothesis Hd: "the soil comes from a different rural location". However, the 'different rural location' B happens to be 500 miles away in Perth, Scotland (simply because it is close to where the soil analyst works and he assumes soil from there is 'typical' of rural soil). To carry out the test the expert compares the soil profile of E with the profiles of samples from the two sites A and B.

Inevitably the LR strongly favours Hp (i.e. site A) over Hd (i.e. site B); the soil profile on the car - even if it was never at location A - is going to be much closer to the A profile than the B profile. But we can conclude absolutely nothing about the posterior probability of A. The LR is completely useless - it tells us nothing other than the fact that the car was more likely to have been driven in the rural location in Dorset than in a rural location in Perth. Since the suspect had never driven the car outside Dorset this is hardly a surprise. Yet, in the case this soil evidence was considered important since it was wrongly assumed to mean that it "provided support for the prosecution hypothesis".
This example also illustrates, however, why in practice it can be impossible to consider exhaustive hypotheses. For such soil cases, it would require us to consider samples from every possible 'other' location. What an expert like Pat Wiltshire (who is also a participant on the FOS programme) does is to choose alternative sites close to the alleged crime scene and compare the profile of each of those and the crime scene profile with the profile from the suspect. While this does not tell us if the suspect was at the crime scene it can tell us how much more likely the suspect was to have been there rather than at sites nearby.

*as pointed out by Joe Gastwirth there could be other hypotheses like "Fred uses the double-headed coin but switches to a regular coin after every 9 tosses".

References
1. Fenton, N. E., M. Neil and D. Berger, (2016). "Bayes and the Law", Annual Review of Statistics and Its Application, Volume 3 (June), pp 51-77, http://dx.doi.org/10.1146/annurev-statistics-041715-033428. Pre-publication version here, and here is the Supplementary Material. See also blog posting.
2. Fenton, N. E., D. Berger, D. Lagnado, M. Neil and A. Hsu, (2013). "When 'neutral' evidence still has probative value (with implications from the Barry George Case)", Science and Justice, http://dx.doi.org/10.1016/j.scijus.2013.07.002. A pre-publication version of the article can be found here.