Bayes Knowledge: Smart data – not big data: 2016

Friday 18 November 2016

Researcher Dr Anthony Constantinou has his identity 'stolen' in the name of convicted multi-millionaire sex molester Anthony Constantinou

Dr Anthony Constantinou is a researcher in Bayesian AI methods at Queen Mary University of London. He currently works on the ERC-funded BAYES-KNOWLEDGE project led by Prof Norman Fenton and has been a recent visitor at the Isaac Newton Institute University of Cambridge Programme Probability and Statistics in Forensic Science.

While this Anthony Constantinou is well respected within the AI research community - and has also gained a strong reputation for his work in applying Bayesian methods to football prediction - there is another much more well known Anthony Constantinou, namely the multi-millionaire son of tycoon Aristos Constantinou who was murdered at his luxury home in the Bishops Avenue London in 1985. After building up his own business empire this Anthony Constantinou - named by the media as "UK's Wolf of Wall Street" - has been in the news for all the wrong reasons: first with the fraud investigation of his CWM business and then with his trial and recent conviction for sexual assaults on a number of women who worked for him.

News reports on Anthony Constantinou

Now, incredibly, it has been discovered that social media accounts (twitter, facebook, pinterest, youtube channel and blog) in the name of the convicted Anthony Constantinou are claiming the academic and research achievements of Dr Anthony Constantinou. We have no way of knowing whether these accounts are authentic or if they were created for malicious reasons by a third party, but somebody has certainly taken a lot of trouble to create this identity theft:

this twitter account with his photo is especially deceptive because it claims Dr Constantinou's achievements but also has articles with his views on football (Dr Constantinou publishes weekly premiership match research-driven predictions based on his pi-football website).
this blogspot account called ‘anthonyisback'
this youtube channel - the powerpoint videos that appear here confirm somebody has gone to significant effort to carry out the identity fraud.
Facebook account
the 'reliable Anthony Constantinou updates' pinterest site.
this twitter account in the name of CWM World

Tuesday 8 November 2016

Confusion over the Likelihood Ratio

7 Jan 2017: There is an update to this post here.

The 'Likelihood Ratio' (LR) has been dominating discussions at the third workshop in our Isaac Newton Institute Cambridge Programme Probability and Statistics in Forensic Science.
There have been many fine talks on the subject - and these talks will be available here for those not fortunate enough to be attending.

We have written before (see links at bottom) about some concerns with the use of the LR. For example, we feel there is often a desire to produce a single LR even when there are multiple different unknown hypotheses and dependent pieces of evidence (in such cases we feel the problem needs to be modelled as a Bayesian network)- see [1]. Based on the extensive discussions this week, I think it is worth recapping on another one of these concerns (namely when hypotheses are non-exhaustive).

To recap: The LR is a formula/method that is recommended for use by forensic scientists when presenting evidence - such as the fact that DNA collected at a crime scene is found to have a profile that matches the DNA profile of a defendant in a case. In general, the LR is a very good and simple method for communicating the impact of evidence (in this case on the hypothesis that the defendant is the source of the DNA found at the crime scene).

To compute the LR, the forensic expert is forced to consider the probability of finding the evidence under both the prosecution and defence hypotheses. So, if the prosecution hypothesis Hp is "Defendant is the source of the DNA found" and the defence hypothesis Hd is "Defendant is not the source of the DNA found" then we compute both the probability of the evidence given Hp - written P(E | Hp) - and the probability of the evidence given Hd - written P(E | Hd). The LR is simply the ratio of these two likelihoods, i.e. P(E | Hp) divided by P(E | Hd).

The very act of considering both likelihood values is a good thing to do because it helps to avoid common errors of communication that can mislead lawyers and juries (notably the prosecutor's fallacy). But, most importantly, the LR is a measure of the probative value of the evidence. However, this notion of probative value is where misunderstandings and confusion sometimes arise. In the case where the defence hypothesis is the negation of the prosecution hypothesis (i.e. Hd is the same as "not Hp" as in our example above) things are clear and very powerful because, by Bayes theorem:

when the LR is greater than one the evidence supports the prosecution hypothesis (increasingly for larger values) - in fact the posterior odds of the prosecution hypothesis increase by a factor of LR over the prior odds.
when the LR is less than one it supports the defence hypothesis (increasingly as the LR gets closer to zero) - the posterior odds of the defence hypothesis increase by a factor of 1/LR over the prior odds.
when the LR is equal to one then the evidence supports neither hypothesis and so is 'neutral' - the posterior odds of both hypotheses are unchanged from their prior odds. In such cases, since the evidence has no probative value lawyers and forensic experts believe it should not be admissible.
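The three cases above all follow from the odds form of Bayes' theorem: when Hd is the negation of Hp, the posterior odds of Hp equal the LR multiplied by the prior odds. The following is a minimal Python sketch of that relationship. The function names, the prior of 0.01 and the three LR values are illustrative assumptions and do not come from any case.

```python
def posterior_odds(prior_prob_hp, likelihood_ratio):
    """Posterior odds of Hp against 'not Hp' (odds form of Bayes' theorem).

    Only valid when the defence hypothesis is the negation of Hp.
    """
    prior_odds = prior_prob_hp / (1.0 - prior_prob_hp)
    return likelihood_ratio * prior_odds


def odds_to_prob(odds):
    """Convert odds in favour of a hypothesis into a probability."""
    return odds / (1.0 + odds)


# Made-up numbers for illustration only.
prior = 0.01
for lr in (100.0, 1.0, 0.01):
    post = odds_to_prob(posterior_odds(prior, lr))
    print(f"LR = {lr:g}: P(Hp) moves from {prior:.4f} to {post:.6f}")
```

An LR above one raises the probability of Hp, an LR of one leaves it unchanged, and an LR below one lowers it.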

However, things are by no means as clear and powerful when the hypotheses are not exhaustive (i.e. not the negation of each other) and in most forensic applications this is the case. For example, in the case of DNA evidence, while the prosecution hypothesis Hp is still "defendant is source of the DNA found" in practice the defence hypothesis Hd is often something like "a person unrelated to the defendant is the source of the DNA found".

In such circumstances the LR can only help us to distinguish between which of the two hypotheses is more likely, so, e.g. when the LR is greater than one the evidence supports the prosecution hypothesis over the defence hypothesis (with larger values leading to increased support). However, unlike the case for exhaustive hypotheses, the LR tells us nothing about the change in odds of the prosecution hypothesis. In fact, it is quite possible that the LR can be very large - i.e. strongly supporting the prosecution hypothesis over the defence hypothesis - even though the posterior probability of the prosecution hypothesis goes down. This rather worrying point is not understood by all forensic scientists (or indeed by all statisticians). Consider the following example (it's a made-up coin tossing example, but has the advantage that the numbers are indisputable):

Fred claims to be able to toss a fair coin in such a way that about 90% of the time it comes up Heads. So the main hypothesis is

H1: Fred has genuine skill

To test the hypothesis, we observe him toss a coin 10 times. It comes out Heads each time. So our evidence E is 10 out of 10 Heads. Our alternative hypothesis is:

H2: Fred is just lucky.

Under the binomial distribution, P(E | H1) is about 0.35 (0.9 to the power 10) while P(E | H2) is about 0.001 (0.5 to the power 10). So the LR is about 350, strongly in favour of H1.

However, the problem here is that H1 and H2 are not exhaustive. There could be another hypothesis H3: "Fred is cheating by using a double-headed coin". Now, P(E | H3) = 1.

If we assume that H1, H2 and H3 are the only possible hypotheses* (i.e. they are exhaustive) and that the priors are equally likely, i.e. each is equal to 1/3 then the posteriors after observing the evidence E are:

H1: 0.25907 H2: 0.00074 H3: 0.74019

So, after observing the evidence E, the posterior for H1 has actually decreased despite the very large LR in its favour over H2.
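The coin example can be checked with a few lines of Python. This is a sketch under the assumptions stated in the text (10 independent tosses, H1, H2 and H3 exhaustive, equal priors of 1/3); the variable and function names are my own. It uses the unrounded likelihoods, so the posteriors come out at roughly 0.258, 0.0007 and 0.741, slightly different from the figures quoted above, which use the rounded values 0.35 and 0.001.

```python
# Likelihood of the evidence E (10 heads in 10 tosses) under each hypothesis.
likelihoods = {
    "H1": 0.9 ** 10,  # Fred has genuine skill: 90% heads per toss
    "H2": 0.5 ** 10,  # Fred is just lucky: ordinary fair tossing
    "H3": 1.0,        # Fred is cheating with a double-headed coin
}
priors = {h: 1.0 / 3.0 for h in likelihoods}  # equal priors, as assumed in the text


def posteriors(priors, likelihoods):
    """Bayes' theorem over a set of mutually exclusive, exhaustive hypotheses."""
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(joint.values())
    return {h: joint[h] / total for h in joint}


lr_h1_vs_h2 = likelihoods["H1"] / likelihoods["H2"]
post = posteriors(priors, likelihoods)

print(f"LR of H1 against H2: {lr_h1_vs_h2:.1f}")
for h in post:
    print(f"{h}: prior {priors[h]:.5f} -> posterior {post[h]:.5f}")
```

The LR of H1 against H2 is in the hundreds, yet the posterior for H1 is below its prior of 1/3.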

In the above example, a good forensic scientist - if considering only H1 and H2 - would conclude by saying something like

"The evidence shows that hypothesis H1 is 350 times more likely than H2, but tells us nothing about whether we should have greater belief in H1 being true; indeed, it is possible that the evidence may much more strongly support some other hypothesis not considered and even make our belief in H1 decrease".

However, in practice (and I can confirm this from having read numerous DNA and other forensic case reports) no such careful statement is made. In fact, the most common assertion used in such circumstances is:

"The evidence provides strong support for hypothesis H1"

Such an assertion is not only mathematically wrong but highly misleading. Consider, as discussed above, a DNA case where:

Hp is "defendant is source of the DNA found"
Hd is "a person unrelated to the defendant is the source of the DNA found".

This particular Hd hypothesis is a common convenient choice for the simple reason that P(E | Hd) is relatively easy to compute (it is the 'random match probability'). For single-source, high quality DNA this probability can be extremely small - of the order of one over several billions; since P(E | Hp) is equal to 1 in this case the LR is several billions. But, this does NOT provide overwhelming support for Hp as is often assumed unless we have been able to rule out all relatives of the defendant as suspects. Indeed, for less than perfect DNA samples it is quite possible for the LR to be in the order of millions but for a close relative to be a more likely source than the defendant.

While confusion and misunderstandings can and do occur as a result of using hypotheses that are not exhaustive, there are many real examples where the choice of such non-exhaustive hypotheses is actually negligent. The following worrying example is based on a real case (location details changed as an appeal is ongoing):

The suspect is accused of committing a crime in a particular rural location A near his home village in Dorset. The evidence E is soil found on the suspect's car. The prosecution hypothesis Hp is "the soil comes from A". The suspect lives (and drives) near this location but claims he did not drive to that specific spot. To 'test' the prosecution hypothesis a soil expert compares Hp with the hypothesis Hd: "the soil comes from a different rural location". However, the 'different rural location' B happens to be 500 miles away in Perth Scotland (simply because it is close to where the soil analyst works and he assumes soil from there is 'typical' of rural soil). To carry out the test the expert considers soil profiles of E and samples from the two sites A and B.

Inevitably the LR strongly favours Hp (i.e. site A) over Hd (i.e. site B); the soil profile on the car - even if it was never at location A - is going to be much closer to the A profile than the B profile. But we can conclude absolutely nothing about the posterior probability of A. The LR is completely useless - it tells us nothing other than the fact that the car was more likely to have been driven in the rural location in Dorset than in a rural location in Perth. Since the suspect had never driven the car outside Dorset this is hardly a surprise. Yet, in the case this soil evidence was considered important since it was wrongly assumed to mean that it "provided support for the prosecution hypothesis".

This example also illustrates, however, why in practice it can be impossible to consider exhaustive hypotheses. For such soil cases, it would require us to consider samples from every possible 'other' location. What an expert like Pat Wiltshire (who is also a participant on the FOS programme) does is to choose alternative sites close to the alleged crime scene and compare the profile of each of those and the crime scene profile with the profile from the suspect. While this does not tell us if the suspect was at the crime scene it can tell us how much more likely the suspect was to have been there rather than sites nearby.

*as pointed out by Joe Gastwirth there could be other hypotheses like "Fred uses the double-headed coin but switches to a regular coin after every 9 tosses".

References

Fenton N.E, Neil M, Berger D, “Bayes and the Law”, Annual Review of Statistics and Its Application, Volume 3, 2016 (June), pp 51-77 http://dx.doi.org/10.1146/annurev-statistics-041715-033428. Pre-publication version here and here is the Supplementary Material. See also blog posting.
Fenton, N. E., D. Berger, D. Lagnado, M. Neil and A. Hsu, (2013). "When ‘neutral’ evidence still has probative value (with implications from the Barry George Case)", Science and Justice, http://dx.doi.org/10.1016/j.scijus.2013.07.002. A pre-publication version of the article can be found here.

Friday 7 October 2016

Bayesian Networks and Argumentation in Evidence Analysis

Some of the workshop participants

On 26-29 September 2016 a workshop on "Bayesian Networks and Argumentation in Evidence Analysis" took place at the Isaac Newton Institute Cambridge. This workshop, which was part of the FOS Programme was also the first public workshop of the ERC-funded project Bayes-Knowledge (ERC-2013-AdG339182-BAYES_KNOWLEDGE).

The workshop was a tremendous success, attracting many of the world's leading scholars in the use of Bayesian networks in law and forensics. Most of the presentations were filmed and can now be viewed here.

There was also a pre-workshop meeting on 23-24 September where participants focused on an important Dutch case that recently went to appeal. The participants were divided into two groups - one group developed a BN model of the case and the other developed an argumentation/scenarios-based model of the case. We plan to further develop these and write up the results.


Some of the participants at the pre-workshop meeting analysing a specific Dutch case

The Bayesian Networks mutual exclusivity problem

Several years ago when we started serious modelling of legal arguments using Bayesian networks we hit a problem that we felt would be easily solved. We had a set of mutually exclusive events such as "X murdered Y, Z murdered Y, Y was not murdered" that we needed to model as separate variables because they had separate causal pathways and evidence.

It turned out that existing BN modelling techniques cannot capture the correct intuitive reasoning when a set of mutually exclusive events need to be modelled as separate nodes instead of states of a single node. The standard proposed ’solution’, which introduces a simple constraint node that enforces mutual exclusivity, fails to preserve the prior probabilities of the events and is therefore flawed.
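The failure of the simple constraint node can be seen numerically. The sketch below is an illustration under my own assumptions, not the model from the paper: the three events are independent Boolean root nodes, the priors 0.5, 0.3 and 0.2 are hypothetical, and the constraint node is taken to be a hard "exactly one event is true" condition that is then observed as satisfied.

```python
from itertools import product

# Hypothetical priors for three mutually exclusive events (they sum to 1).
priors = {
    "X murdered Y": 0.5,
    "Z murdered Y": 0.3,
    "Y was not murdered": 0.2,
}


def marginals_given_constraint(priors):
    """Marginal of each event after conditioning on 'exactly one event is true'.

    Each event is modelled as an independent Boolean node, so the joint
    probability of a combination of states is a product of node probabilities.
    """
    names = list(priors)
    weight = {name: 0.0 for name in names}
    total = 0.0
    for states in product([True, False], repeat=len(names)):
        p = 1.0
        for name, state in zip(names, states):
            p *= priors[name] if state else 1.0 - priors[name]
        if sum(states) == 1:  # the constraint node is satisfied
            total += p
            for name, state in zip(names, states):
                if state:
                    weight[name] += p
    return {name: weight[name] / total for name in names}


constrained = marginals_given_constraint(priors)
for name in priors:
    print(f"{name}: prior {priors[name]:.3f}, after constraint {constrained[name]:.3f}")
```

With these numbers the probability of "X murdered Y" rises from 0.5 to about 0.6 before any evidence has been entered, which is the distortion of the priors described above.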

In 2012 I (together with the co-authors listed below) produced an initial novel and simple solution to this problem that works in a reasonable set of circumstances, but it proved to be difficult to get people to understand why the problem was an important one that needed to be solved. After many changes and iterations this work has finally been published and, as a 'gold access paper', it is free for anybody to download in full (see link below).

During the current Programme "Probability and Statistics in Forensic Science" that I am helping to run at the Isaac Newton Institute for Mathematical Sciences, Cambridge, 18 July - 21 Dec 2016, it has become clear that the mutual exclusivity problem is critical in any legal case where there are diverse prosecution and defence narratives. Although our solution does not work in all cases (and indeed we are working on more comprehensive approaches) we feel it is an important start.

Norman Fenton, Martin Neil, David Lagnado, William Marsh, Barbaros Yet, Anthony Constantinou, "How to model mutually exclusive events based on independent causal pathways in Bayesian network models", Knowledge-Based Systems, Available online 17 September 2016
http://dx.doi.org/10.1016/j.knosys.2016.09.012

Saturday 17 September 2016

Bayesian networks: increasingly important in cross-disciplinary work

The growing importance of Bayesian networks was demonstrated this week by the award of a prestigious Leverhulme Trust Research Project Grant of £385,510 to Queen Mary University of London that ultimately will lead to improved design and use of self-monitoring systems such as blood sugar monitors, home energy smart meters, and self-improvement mobile phone apps.

The project, CAUSAL-DYNAMICS ("Improved Understanding of Causal Models in Dynamic Decision-making") is a collaborative project, led by Professor Norman Fenton of the School of Electronic Engineering and Computer Science, with co-investigators Dr Magda Osman (School of Biological and Chemical Sciences), Prof Martin Neil (School of Electronic Engineering and Computer Science) and Prof David Lagnado (Department of Experimental Psychology, University College London).

The project exploits Fenton and Neil's expertise in causal modelling using Bayesian networks and Osman and Lagnado's expertise in cognitive decision making. Previously, psychologists have extensively studied dynamic decision-making without formally modelling causality while statisticians, computer scientists, and AI researchers have extensively studied causality without considering its central role in human dynamic decision making. This new project starts with the hypothesis that we can formally model dynamic decision-making from a causal perspective. This enables us to identify both where sub-optimal decisions are made and to recommend what the optimal decision is. The hypothesis will be tested in real world examples of how people make decisions when interacting with dynamic self-monitoring systems such as blood sugar monitors and energy smart meters and will lead to improved understanding and design of such systems.

The project is for 3 years starting Jan 2017. For further details, see: CAUSAL-DYNAMICS.

WATCH THIS SPACE FOR THE ANNOUNCEMENT VERY SOON OF TWO OTHER MAJOR NEW CROSS-DISCIPLINARY BAYESIAN NETWORK PROJECTS!!

About the Leverhulme Trust
The Leverhulme Trust was established by the Will of William Hesketh Lever, the founder of Lever Brothers. Since 1925 the Trust has provided grants and scholarships for research and education; today it is one of the largest all-subject providers of research funding in the UK, distributing approximately £80 million a year. For more information: www.leverhulme.ac.uk / @LeverhulmeTrust

Friday 16 September 2016

Bayes and the Law: what's been happening in Cambridge and how you can see it

Programme Organisers (left to right): R Gill, D Lagnado, L Schneps, D Balding, N Fenton

Since 21 July 2016 I have been running the Isaac Newton Institute (INI) Programme on Probability and Statistics in Forensic Science in Cambridge.

For those of you who were not fortunate enough to be at the first formal workshop "The nature of questions arising in court that can be addressed via probability and statistical methods" (30 August to 2 September) you can watch the full videos here of most of the 35 presentations on the INI website. The presentation slides are also available via the INI link.

The workshop attracted many of the world's leading figures from the law, statistics and forensics with a mixture of academics (including mathematicians and legal scholars), forensic practitioners, and practising lawyers (including judges and eminent QCs). It was rated a great success.

The second formal workshop "Bayesian Networks and Argumentation in Evidence Analysis" will take place on 26-29 September. It is also part of the BAYES-KNOWLEDGE project programme of work. For those who wish to attend, but cannot, the workshop will be streamed live.

Norman Fenton, 16 September 2016

Friday 1 July 2016

The likelihood ratio and why its use in forensic analysis is often flawed

FORREST 2016 (for details see here)

I am giving the opening address at the Forensic Institute 2016 Conference (FORREST 2016) in Glasgow on 5 July 2016. The talk is about the benefits and pitfalls of using the likelihood ratio to help understand the impact of forensic evidence. The powerpoint slide show for my talk is here.

While a lot of the material is based on our recent Bayes and the Law paper, there is a new simple example of the danger of using the likelihood ratio (LR) when the defence hypothesis is not the negation of the prosecution hypothesis. Recall that the LR for some evidence E is the probability of E given the prosecution hypothesis divided by the probability of E given the defence hypothesis. The reason the LR is popular is because it is a measure of the probative value of the evidence E in the sense that:

LR>1 means E supports the prosecution hypothesis
LR<1 means E supports the defence hypothesis
LR=1 means E has no probative value

This follows from Bayes Theorem but only when the defence hypothesis is the negation of the prosecution hypothesis. The problem is that there are Forensic Science Guidelines* that explicitly state that this requirement is not necessary. But if the requirement is not met then it is possible to have LR<1 even though E actually supports the prosecution hypothesis. Here is the example:

A raffle has 100 tickets numbered 1 to 100

Joe buys 2 tickets and gets numbers 3 and 99

The ticket is drawn but is blown away in the wind.

Joe says the ticket drawn was 99 and demands the prize, but the organisers say 99 was not the winning ticket. In this case the prosecution hypothesis H is “Joe won the raffle”.

Suppose we have the following evidence E presented by a totally reliable eye witness:

E: “winning ticket was an odd nineties number (i.e. 91, 93, 95, 97, or 99)”

Does the evidence E support H? Let's do the calculations:

Probability of E given H = ½

Probability of E given not H = 4/98

So the LR is (1/2)/(4/98) = 12.25

That means the evidence CLEARLY supports H. In fact, the probability of H increases from a prior of 1/50 to a posterior of 1/5, so there is no doubt it is supportive.

But suppose the organisers assert that their (defence) hypothesis is:

H’: “Winning ticket was a number between 95 and 97”

Then in this case we have:

Probability of E given H = ½

Probability of E given H’ = 2/3

So the LR is ( 1/2)/(2/3) = 0.75

That means that in this case the evidence supports H’ over H. The problem is that, while the LR does indeed 'prove' that the evidence is more supportive of H' than H, that is actually irrelevant unless there is other evidence that proves that H' is the only possible alternative to H (i.e. that H' is equivalent to 'not H'). In fact, the 'defence' hypothesis has been cherry picked. The evidence E supports H irrespective of which cherry-picked alternative is considered.
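The raffle figures can be reproduced by counting tickets. The Python sketch below assumes every ticket is equally likely to be drawn and reads "between 95 and 97" as the tickets 95, 96 and 97 (which is what gives the 2/3 above); the helper name prob is my own.

```python
from fractions import Fraction

tickets = set(range(1, 101))          # tickets numbered 1 to 100
joe = {3, 99}                         # H: the winning ticket is one of Joe's
not_joe = tickets - joe               # not H
h_prime = {95, 96, 97}                # H': the organisers' chosen alternative
odd_nineties = {91, 93, 95, 97, 99}   # E: the eye witness evidence


def prob(event, given):
    """P(winning ticket in event | winning ticket in given), uniform draw."""
    return Fraction(len(event & given), len(given))


lr_vs_not_h = prob(odd_nineties, joe) / prob(odd_nineties, not_joe)
lr_vs_h_prime = prob(odd_nineties, joe) / prob(odd_nineties, h_prime)
prior_h = prob(joe, tickets)
posterior_h = prob(joe, odd_nineties)

print("LR of H against not H:", float(lr_vs_not_h))
print("LR of H against H':", float(lr_vs_h_prime))
print("Prior for H:", prior_h, "Posterior for H:", posterior_h)
```

The posterior for H is computed directly from the tickets consistent with E and does not depend on which alternative hypothesis the organisers put forward; only the LR changes.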

Norman Fenton, 1 July 2016

*Jackson G, Aitken C, Roberts P. 2013. Practitioner guide no. 4. Case assessment and interpretation of expert evidence: guidance for judges, lawyers, forensic scientists and expert witnesses. London: R. Stat. Soc. http://www.maths.ed.ac.uk/∼cgga/Guide-4-WEB.pdf. Page 29: "The LR is the ratio of two probabilities, conditioned on mutually exclusive (but not necessarily exhaustive) propositions."

Friday 17 June 2016

Bayes and the Law: Cambridge event and new review paper

When we set up the Bayes and the Law network in 2012 we made the following assertion:

Proper use of statistics and probabilistic reasoning has the potential to improve dramatically the efficiency, transparency and fairness of the criminal justice system and the accuracy of its verdicts, by enabling the relevance of evidence – especially forensic evidence - to be meaningfully evaluated and communicated. However, its actual use in practice is minimal, and indeed the most natural way to handle probabilistic evidence (Bayes) has generally been shunned.

The first workshop (30th August to 2nd September 2016) that is part of our 6-month programme "Probability and Statistics in Forensic Science" at the Isaac Newton Institute of Mathematics Cambridge directly addresses the above assertion and seeks to understand the scope, limitations, and barriers of using statistics and probability in court. The Workshop brings together many of the world's leading academics and practitioners (including lawyers) in this area. Information on the programme and how to participate can be found here.

A new review paper* "Bayes and the Law" has just been published in Annual Review of Statistics and Its Application.

This paper reviews the potential and actual use of Bayes in the law and explains the main reasons for its lack of impact on legal practice. These include misconceptions by the legal community about Bayes’ theorem, over-reliance on the use of the likelihood ratio and the lack of adoption of modern computational methods. The paper argues that Bayesian Networks (BNs), which automatically produce the necessary Bayesian calculations, provide an opportunity to address most concerns about using Bayes in the law.

*Full citation:

Fenton N.E, Neil M, Berger D, “Bayes and the Law”, Annual Review of Statistics and Its Application, Volume 3, pp51-77, June 2016 http://dx.doi.org/10.1146/annurev-statistics-041715-033428. Pre-publication version is here and the Supplementary Material is here.

Monday 6 June 2016

Using expert judgment to build better decision support models

The 'big data' juggernaut seems to be rumbling along with many oblivious to the limitations of what pure machine learning techniques can really achieve in most important applications. We have written here before about the dangers of 'learning' from data alone (no matter how 'big' the data is).

Contrary to the narrative being sold by many in the big data community, if you want accurate predictions and improved decision-making then, invariably, you need to incorporate human knowledge and judgment. Much of the research in the BAYES-KNOWLEDGE project is concerned with building better decision-support models - normally Bayesian networks (BNs) - by incorporating knowledge and data.

There are two major steps to building a BN model for a decision analysis problem:

Identify the key variables and which ones directly influence each other.
Define the probability tables for each variable conditioned on its parents

We have been reporting on this blog about various recent papers from the project that have addressed these steps, most in the context of case studies*, while some of the project work on combining judgement and data to learn the probability tables has been incorporated into the BAYES-KNOWLEDGE tool on the Agenarisk platform.

Now new research (supported jointly by BAYES-KNOWLEDGE and the China Scholarship Council) has been published in the top ranked journal "Decision Support Systems" that describes an important advance in defining the probability tables of a BN. The paper shows that, in practice, many of the variables in a BN model are related by certain types of 'monotonic constraints'. As a very simple example consider a model in which the variable "Lung cancer" has the parent "Smoking". Although we do not know the exact relationship between these variables it is known that as probability values of "Smoking" increase so do the probability values of "Lung cancer". So this is an example of a positive monotonic constraint. It turns out that, even with fairly minimal data, it is possible to exploit an expert's knowledge about the existence of monotonic constraints to learn complete probability tables that lead to accurate and useful models. This is important because most approaches to incorporating expert judgement to define the probability tables require the expert to consider multiple combinations of variable states.
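To make the idea of a positive monotonic constraint concrete, here is a small Python sketch that checks whether a probability table respects one. The post does not give the paper's formal definition, so the check uses a common formalisation as an assumption: with the states of parent and child listed in increasing order, moving to a higher parent state must never increase the child's cumulative probability of being at or below any given state. The function name and the table values are hypothetical.

```python
def satisfies_positive_monotonic(cpt, tol=1e-12):
    """Check a positive monotonic constraint on a conditional probability table.

    cpt is a list of rows, one per parent state in increasing order; each row
    is the child's distribution over its states, also in increasing order.
    """
    def cdf(row):
        running, out = 0.0, []
        for p in row:
            running += p
            out.append(running)
        return out

    cdfs = [cdf(row) for row in cpt]
    # Compare each parent state with the next higher one.
    return all(
        higher[k] <= lower[k] + tol
        for lower, higher in zip(cdfs, cdfs[1:])
        for k in range(len(lower))
    )


# Hypothetical table: rows are Smoking = no, yes; columns are Lung cancer = no, yes.
consistent = [[0.99, 0.01], [0.90, 0.10]]
violating = [[0.90, 0.10], [0.99, 0.01]]

print(satisfies_positive_monotonic(consistent))
print(satisfies_positive_monotonic(violating))
```

A learning procedure can use a check of this kind to reject or repair tables estimated from sparse data; how the paper itself enforces the constraint during learning is not described in this post.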

The full citation for this new paper is:

Zhou, Y., Fenton, N. E., Zhu, C. (2016), "An Empirical Study of Bayesian Network Parameter Learning with Monotonic Causality Constraints", Decision Support Systems. http://dx.doi.org/10.1016/j.dss.2016.05.001 pre-publication pdf version here.

*See:

Wednesday 1 June 2016

Bayesian networks for Cost, Benefit and Risk Analysis of Agricultural Development Projects

Successful implementation of major projects requires careful management of uncertainty and risk. Yet, uncertainty is rarely effectively calculated when analysing project costs and benefits. In the case of major agricultural and other development projects in Africa this challenge is especially important.

A paper just published* in the journal Expert Systems with Applications presents a Bayesian network (BN) modelling framework to calculate the costs, benefits, and return on investment of a project over a specified time period, allowing for changing circumstances and trade-offs. Marianne Gadeberg and Eike Luedeling have written an overview of the work here.

The framework uses hybrid and dynamic BNs containing both discrete and continuous variables over multiple time stages. The BN framework calculates costs and benefits based on multiple causal factors including the effects of individual risk factors, budget deficits, and time value discounting, taking account of the parameter uncertainty of all continuous variables. The framework can serve as the basis for various project management assessments and is illustrated using a case study of an agricultural development project. The work was a collaboration between the World Agroforestry Centre (ICRAF), Nairobi, Kenya, the Risk Information Management Group at Queen Mary (as part of the BAYES-KNOWLEDGE project) and Agena Ltd.

*The full reference is:

Yet, B., Constantinou, A., Fenton, N., Neil, M., Luedeling, E., & Shepherd, K. (2016). "A Bayesian Network Framework for Project Cost, Benefit and Risk Analysis with an Agricultural Development Case Study" . Expert Systems with Applications, Volume 60, 30 October 2016, Pages 141–155. DOI: 10.1016/j.eswa.2016.05.005.

Until July 2016 the full published pdf is available for free. A permanent pre-publication pdf is available here.

See also: Can we build a better project: assessing complexities in development projects

Acknowledgements: Part of this work was performed under the auspices of EU project ERC-2013-AdG339182-BAYES_KNOWLEDGE and part under ICRAF Contract No SD4/2012/214 issued to Agena. We acknowledge support from the Water, Land and Ecosystems (WLE) program of the Consultative Group on International Agricultural Research (CGIAR).

Thursday 26 May 2016

Using Bayesian networks to assess new forensic evidence in an appeal case

If new forensic evidence becomes available after a conviction how do lawyers determine whether it raises sufficient questions about the verdict in order to launch an appeal? It turns out that there is no systematic framework to help lawyers do this. But a paper published today by Nadine Smit and colleagues in Crime Science presents such a framework driven by a recent case, in which a defendant was convicted primarily on the basis of sound evidence, but where subsequent analysis of the evidence revealed additional sounds that were not considered during the trial.

From the case documentation, we know the following:

A baby was injured during an incident on the top floor of a house
Blood from the baby was found on the wall in one of the rooms upstairs
On an audio recording of the emergency telephone call made by the suspect, a scraping sound (allegedly indicating scraping blood off a wall) can be heard
The suspect was charged with attempted murder

The audio evidence played a significant role in the trial. But, during the appeal preparation process, the call was re-analysed by an audio expert on behalf of the defence, and four other sounds were identified on the same recording that, according to the expert, showed similarities to the original sound. In particular, one of these sounds was of interest because of background noise that could be heard simultaneously. The background noise was presumed to be the television, which was located in a different room to where the prosecution argued the scraping of the blood took place. During this second sound, the TV (located downstairs) could be heard simultaneously on the emergency recording. A statement by the police reads that the suspect was frequently rubbing his face in their presence. The defence proposed that the incriminating sound in the recording was not blood scraping after all, but simply the defendant rubbing his face.

The framework described in Smit's paper is intended to bridge the gap between what is generally known from scientific analyses and what is hypothesized in a legal setting. It is based on Bayesian networks (BNs), which provide a structured and understandable way to evaluate the evidence in the specific case context and present it clearly in court. However, BN methods are often criticised for not being sufficiently transparent for legal professionals. To address this concern, the paper shows the extent to which the reasoning and decisions of the particular case can be made explicit and transparent. The BN approach enables us to clearly define the relevant propositions and evidence, and uses sensitivity analysis to assess the impact of the evidence under different prior assumptions. The results show that such a framework is suitable for identifying information that is currently missing but crucial for a valid and complete reasoning process. Furthermore, a method is provided whereby BNs can serve as a guide not only to reason with incomplete evidence in forensic cases, but also to identify very specific research questions that should be addressed to extend the evidence base and help solve similar issues in the future.
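To give a feel for what a sensitivity analysis over prior assumptions involves, here is a minimal sketch in Python. It is not the model from the paper: it collapses the case to a single hypothesis H (say, "the sound is blood being scraped") and a single item of evidence summarised by a likelihood ratio. The function names and every number are illustrative assumptions, not values from the case.

```python
def posterior(prior, likelihood_ratio):
    """Posterior P(H | E) from a prior P(H) and LR = P(E | H) / P(E | not H).

    Uses the odds form of Bayes' theorem: posterior odds = prior odds * LR.
    """
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


def sensitivity(priors, likelihood_ratio):
    """Return (prior, posterior) pairs for a range of prior assumptions."""
    return [(p, posterior(p, likelihood_ratio)) for p in priors]


if __name__ == "__main__":
    # Hypothetical likelihood ratio: the evidence is assumed to be half as
    # probable under H as under the alternative. Not taken from the paper.
    assumed_lr = 0.5
    for prior, post in sensitivity([0.1, 0.3, 0.5, 0.7, 0.9], assumed_lr):
        print(f"prior P(H) = {prior:.1f} -> posterior P(H | E) = {post:.3f}")
```

A full BN does the same kind of calculation over many linked propositions at once; the point of the sweep is to see whether the conclusion survives across the range of priors a court might reasonably hold.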

Full citation:

Smit, N. M., Lagnado, D. A., Morgan, R. M., & Fenton, N. E. (2016). "An investigation of the application of Bayesian networks to case assessment in an appeal case". Crime Science, 5: 9. DOI 10.1186/s40163-016-0057-6 (open access). Published version pdf.

The research was funded by the Engineering and Physical Sciences Research Council of the UK through the Security Science Doctoral Research Training Centre (UCL SECReT) based at University College London (EP/G037264/1), and the European Research Council (ERC-2013-AdG339182-BAYES_KNOWLEDGE).

The BN model (which is fully specified in the paper) was built and run using the free version of AgenaRisk.

Wednesday 27 April 2016

Research Showcase, Queen Mary University of London 2016

Some shots of BAYES-KNOWLEDGE staff from today's Research Showcase event.

Tuesday 26 April 2016

Hillsborough Inquest - my input

With today's verdict (fans unlawfully killed) coming after more than two years I can now speak about my own involvement in the Inquest.

Because of the years that have passed, few people are aware that there was a 'near-miss' disaster at Hillsborough eight years before the actual disaster. The circumstances were essentially identical - an FA Cup Semi Final with far too many supporters let in to the Leppings Lane stand, leading to a massive crush. Because of the quick thinking of a steward who was able to open a gate onto the pitch, nobody died on that occasion (although there were many injuries). I know this because I was present at that earlier near disaster and I was, in fact, Secretary of the Sheffield Spurs Supporters Club. At the time I wrote to the FA and South Yorkshire police as I felt mistakes had been made, and indeed the incident was sufficiently serious that Hillsborough (which had been used every year as one of the two semi-final venues) was avoided until 1988 (the year before the disaster). Immediately after the disaster in 1989 I wrote to the FA and Lord Taylor (who led the original inquiry) to inform them of the events of 1981. Although I was interviewed at that time by the Police investigators, my evidence was never used.

In 2014 - out of the blue - I was asked to attend the new Hillsborough Inquest as it had been decided that the 1981 incident was an important piece of the story. Here are a couple of links to media reports about my appearance:

Norman Fenton, 26 April 2016

Thursday 21 April 2016

Evangelia Kyrimi wins first prize for best poster at annual UK Meeting on Causal Inference

Evangelia Kyrimi, a PhD student supervised by Dr William Marsh, has won first prize for the best poster at the annual UK Meeting on Causal Inference that took place last week in London. Her poster was titled "A progressive explanation of causal inference in 'hybrid' Bayesian Networks for supporting clinical decision making" (click on link to download the full poster).

It explains that, although many ‘causal’ models (notably in medicine) have been developed as decision tools, few of them have been used in practice and this is often due to lack of perceived trustworthiness of the model. Giving users an explanation of the model’s reasoning is crucial for effective decision support. Evangelia's research provides a coherent explanation of inference that can be applied to any causal Bayesian Network model.

The prize included some excellent books...

Friday 25 March 2016

Statistics of coincidences: Ben Geen case revisited (ABC)

In November 2014 I reported on the case of nurse Ben Geen, who was convicted in 2006 of murdering 2 patients and seriously harming 15 others. I had been asked to produce an expert report on the 'statistical coincidences' in the case for the Criminal Cases Review Commission.

Now a 30-minute documentary on the case presented by Joel Werner is to be aired on Australia's national radio station ABC on 28 March. In the programme (which you can listen to in full from the links at the top of the ABC page) I present a lay summary of the statistical argument (from minutes 16:30 to 21:34).

Norman Fenton

Saturday 19 March 2016

Turning poorly structured data into intelligent Bayesian Network models for medical decision support

Medical data is very often badly structured, incomplete and inconsistent. This limits our ability to generate useful models for prediction and decision support if we rely purely on machine learning techniques. That means we need to exploit expert knowledge at various model development stages. This problem - which is common in many application domains - is tackled in a paper** published in the latest issue of Artificial Intelligence in Medicine.

The paper describes a rigorous and repeatable method for building effective Bayesian Network (BN) models from complex data - much of which comes in unstructured and incomplete responses by patients from questionnaires and interviews. Such data inevitably contains repetitive, redundant and contradictory responses; without expert knowledge, learning a BN model from the data alone is especially problematic where we are interested in simulating causal interventions for risk management. The novelty of this work is that it provides a rigorous, consolidated and generalised framework that addresses the whole life-cycle of BN model development. The method is validated using data from forensic psychiatry. The resulting BN models demonstrate competitive to superior predictive performance against the data-driven state-of-the-art models. More importantly, the resulting BN models go beyond improving predictive accuracy and into usefulness for risk management through intervention, and enhanced decision support in terms of answering complex clinical questions that are based on unobserved evidence.
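The difference between predicting from observed data and simulating an intervention can be seen in a toy example. The sketch below is not from the paper: it is a hand-specified three-node network with a hypothetical risk factor Z, treatment X and adverse outcome Y, with made-up probabilities, and it assumes the causal structure Z -> X, Z -> Y, X -> Y is supplied by an expert.

```python
from itertools import product

# Hypothetical conditional probability tables (illustrative numbers only).
P_Z = {1: 0.3, 0: 0.7}                      # P(Z = z): patient is high risk
P_X1_GIVEN_Z = {1: 0.8, 0: 0.2}             # P(X = 1 | Z = z): treated
P_Y1_GIVEN_XZ = {(1, 1): 0.6, (1, 0): 0.2,  # P(Y = 1 | X = x, Z = z),
                 (0, 1): 0.8, (0, 0): 0.3}  # keyed by (x, z)


def joint(z, x, y):
    """Joint probability P(Z = z, X = x, Y = y) from the factorised network."""
    px = P_X1_GIVEN_Z[z] if x == 1 else 1.0 - P_X1_GIVEN_Z[z]
    py = P_Y1_GIVEN_XZ[(x, z)] if y == 1 else 1.0 - P_Y1_GIVEN_XZ[(x, z)]
    return P_Z[z] * px * py


def p_y_given_x(x):
    """Observational P(Y = 1 | X = x): condition on X, summing out Z."""
    numerator = sum(joint(z, x, 1) for z in (0, 1))
    denominator = sum(joint(z, x, y) for z, y in product((0, 1), repeat=2))
    return numerator / denominator


def p_y_do_x(x):
    """Interventional P(Y = 1 | do(X = x)): cut the Z -> X link, keep P(Z)."""
    return sum(P_Z[z] * P_Y1_GIVEN_XZ[(x, z)] for z in (0, 1))


if __name__ == "__main__":
    for x in (1, 0):
        print(f"X = {x}: observed {p_y_given_x(x):.3f}, "
              f"intervened {p_y_do_x(x):.3f}")
```

With these assumed numbers, treated patients have worse outcomes in the observed data simply because high-risk patients are more likely to be treated, while the intervention itself lowers the risk. A model that only reproduces the observed association would give the wrong answer to the risk management question, which is why the causal structure has to come from somewhere other than the data.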

The method is applicable to any application domain involving large-scale decision analysis based on such complex and unstructured information. It challenges decision scientists to reason about building models based on what information is really required for inference, rather than based on what data is available. Hence, it forces decision scientists to use available data in a much smarter way.

**The full reference for the paper is:

Constantinou, A. C., Fenton, N., Marsh, W., & Radlinski, L. (2016). "From complex questionnaire and interviewing data to intelligent Bayesian Network models for medical decision support". Artificial Intelligence in Medicine, Vol 67, pages 75-93. DOI http://dx.doi.org/10.1016/j.artmed.2016.01.002

For those who do not have access to the journal a pre-publication draft can be downloaded: http://constantinou.info/downloads/papers/complexBN.pdf
