Tuite v The Queen
[2015] VSCA 148
12 June 2015
SUPREME COURT OF VICTORIA
COURT OF APPEAL
S APCR 2015 0007
| CLINTON TUITE | Applicant |
| v | |
| THE QUEEN | Respondent |
---
| JUDGES: | MAXWELL ACJ, REDLICH and WEINBERG JJA |
| WHERE HELD: | MELBOURNE |
| DATE OF HEARING: | 23 March 2015 |
| DATE OF ORDERS: | 27 March 2015 |
| DATE OF REASONS: | 12 June 2015 |
| MEDIUM NEUTRAL CITATION: | [2015] VSCA 148 |
| JUDGMENT APPEALED FROM: | [2014] VSC 662 (Emerton J) |
---
EVIDENCE – Admissibility – Opinion evidence – Expert forensic evidence – DNA samples – Likelihood ratios – New statistical methodology utilised – Whether shown to be reliable – Whether generally accepted – Whether appropriately validated – Whether probative value outweighed by danger of unfair prejudice – Decision of trial judge open – Appeal dismissed – Evidence Act 2008 ss 79(1), 137.
STATUTORY INTERPRETATION – Evidence – Admissibility – Opinion evidence – Requirement of ‘specialised knowledge’ – Whether implied requirement of evidentiary reliability – Whether field of knowledge must be accepted or established – Whether opinion must be based on ‘good grounds’ – Honeysett v The Queen (2014) 88 ALJR 786 applied, Daubert v Merrell Dow Pharmaceuticals Inc (1993) 509 US 579 distinguished – Evidence Act 2008 s 79(1).
WORDS AND PHRASES – ‘Specialised knowledge’, ‘knowledge’.
---
| APPEARANCES: | Counsel | Solicitors |
| For the Applicant | Mr J Desmond | Doogue O’Brien George |
| For the Crown | Dr N Rogers SC with Mr B Sonnet | Ms V Anscombe, Acting Solicitor for Public Prosecutions |
MAXWELL ACJ
REDLICH JA
WEINBERG JA:
Summary
This interlocutory appeal raises questions of general importance about the admissibility of expert scientific evidence. At issue in the appeal is the reliability of DNA evidence which utilises a relatively new statistical methodology. Two key questions arise, as follows:
(a) is reliability a criterion of admissibility of opinion evidence under s 79(1) of the Evidence Act 2008 (the ‘Act’), or is reliability to be assessed in deciding whether the evidence should be excluded (under s 135 or s 137); and
(b) by what criteria is the reliability of expert scientific evidence to be assessed?
The applicant has been charged with aggravated burglary, rape, indecent assault and intentionally causing injury. Expert opinion evidence is to be called at the trial about the analysis of DNA samples from the crime scene and what are said to be the similarities between those samples and a DNA sample provided by the applicant following an unrelated conviction.
The DNA evidence is to be presented in the usual form of a ‘likelihood ratio’.[1] That is, for each DNA sample where the suspect cannot be excluded as a contributor, a ratio is calculated which shows how much more likely it is that the suspect was the source of the DNA (or a contributor to it) than that some other person chosen at random from the population was the source (or a contributor).
[1]See R v Berry (2007) 17 VR 153, 161 [30] (‘Berry’).
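By way of illustration only, a likelihood ratio of this kind is conventionally written in forensic statistics as a ratio of two conditional probabilities. The symbols used here are editorial assumptions and do not appear in the evidence: E denotes the observed DNA profile, H_1 the proposition that the suspect was the source of (or a contributor to) the DNA, and H_2 the proposition that the source (or contributor) was an unknown person chosen at random from the relevant population.

```latex
\[
\mathrm{LR} \;=\; \frac{\Pr(E \mid H_1)}{\Pr(E \mid H_2)}
\]
```

On this notation, a reported likelihood ratio of 29 billion would correspond to LR of approximately 2.9 × 10^10, meaning that the observed profile is that many times more probable under H_1 than under H_2.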
In this case, the likelihood ratios have been calculated using a recently-developed software package, known as STRmix, which was introduced into Victoria in March 2013. At a pre-trial hearing, the applicant challenged the admissibility of the DNA evidence on the ground that the new methodology was not — or had not been shown to be — sufficiently reliable for use in criminal trials. The methodology was largely untested, it was said, and had not been generally accepted by the forensic science community.
The novelty of the methodology and its lack of proven reliability meant, according to the argument, that the opinion evidence must be excluded. This was said to be so because either:
(a) the opinions were not based on ‘specialised knowledge’ within the meaning of s 79(1) of the Act, and the evidence was therefore inadmissible; or
(b) even if the evidence were admissible under s 79(1), its probative value was outweighed by the danger of unfair prejudice and the evidence must therefore be excluded under s 137 of the Act.
The pre-trial hearing extended over some 22 days, in the course of which the judge heard evidence from the three prosecution experts and one defence expert. In a reserved decision, her Honour rejected the application to exclude the evidence. Her Honour delivered comprehensive reasons for her conclusion that the evidence was admissible under s 79(1) and that s 137 did not require its exclusion.
The applicant sought leave to appeal against the ruling. He had first sought and obtained a certificate from the judge that the decision concerned the admissibility of evidence which, had it been ruled inadmissible, would have ‘eliminated or substantially weakened’ the prosecution case.[2] The application for leave to appeal was heard on 23 March 2015. On 27 March, we announced that the application for leave to appeal would be refused and that we would publish our reasons in due course. These are those reasons.
[2]Such a certificate is a pre-condition of an interlocutory appeal where the decision under challenge concerns the admissibility of evidence: Criminal Procedure Act 2009 s 295(3)(a).
Two preliminary points should be made. First, as the decisions of this Court have made clear, where a question is raised on an interlocutory appeal about the admissibility/exclusion of evidence, appellate intervention is limited by the principles in House v The King.[3] The question at this stage is whether the judge’s decision was reasonably open, not whether it was correct. A different standard of review applies if the issue is raised on a conviction appeal.[4]
[3](1936) 55 CLR 499.
[4]McCartney v The Queen (2012) 38 VR 1, 11–12 [46]–[51].
Secondly, her Honour had the benefit of hearing from each of the three Crown experts, and from a defence expert, over some 22 days of hearing. It is simply not possible at this stage, nor is it necessary given the nature of an interlocutory appeal, for this Court to acquire the same level of technical understanding of the particular field of scientific learning.
For reasons which follow, we concluded that:
(a) the question of the reliability of an expert opinion does not fall to be considered under s 79(1);
(b) it was open to the trial judge, on the evidence before her, to conclude that the opinion evidence of the Crown witnesses was based upon their specialised knowledge, and was therefore admissible under s 79(1);
(c) the question of the reliability of opinion evidence falls to be determined as part of the assessment which the Court undertakes for the purposes of s 137;[5]
(d) there was no error of principle in her Honour’s approach to the assessment of reliability for that purpose; and
(e) it was open to her Honour, on the evidence before her, to conclude that the probative value of the opinion evidence was not outweighed by the danger of unfair prejudice and hence that s 137 did not require its exclusion.
[5]Or the Act s 135.
The obvious risk in a criminal trial when expert evidence is led from a forensic scientist is that a jury will give the evidence more weight than it deserves.[6] To prevent unfair prejudice of that kind, it is essential that the reliability of expert evidence be established to the court’s satisfaction (under s 137) before it is led. We have concluded that the touchstone of reliability for this purpose is proof of appropriate validation, both of the underlying science (where necessary) and of the particular methodology being employed.
[6]Murphy v The Queen (1989) 167 CLR 94, 130–1; R v Mohan [1994] 2 SCR 9, 21; Hannes v DPP (Cth) (2006) 165 A Crim R 151, 226 [289]–[290].
In the present case, the trial judge scrutinised the proposed evidence with great care. In particular, her Honour carefully examined the validation studies carried out in relation to the STRmix methodology and was satisfied that the requirement of evidentiary reliability had been met. That conclusion, too, was well open on the evidence.
The respondent acknowledged during the hearing of the application that its expert witnesses had given evidence before the trial judge which was additional to that contained in their existing witness statements. In our view, the respondent should prepare supplementary witness statements setting out any of that additional evidence which is to be relied upon at trial.
The failure of the interlocutory appeal does not affect in any way the ability of the applicant to raise the same issue again during the trial, in the event that materially different evidence is adduced, or on appeal in the event of a conviction.[7]
[7]Criminal Procedure Act 2009 s 297(3).
A. THE DNA EVIDENCE
Collection and analysis of the DNA samples[8]
[8]Paragraphs 15–21 are based on the judge’s reasons: DPP v Tuite [2014] VSC 662 (‘Reasons’).
When investigating police arrived at the complainant’s address, they found the complainant with a cable tie attached to her left wrist and duct tape wrapped around her neck. Crime scene investigators took a series of photographs and seized the following items for forensic examination:
·a makeshift blindfold made of a piece of dark material, apparently torn from the sleeve of a garment, with a shoelace tie;
·seven cable ties;
·the unused end of a cigarette; and
·two used cigarette butts.
The prosecution seeks to rely at trial on the results of DNA analyses of three samples taken from the blindfold fabric and its shoelace tie (referred to as Items 1-1, 1-2 and 1-3); trace samples taken from the entire area of the seven cable ties combined (referred to as Item 4-1); a portion of filter from one cigarette butt (referred to as Item 5A-1); and a portion of filter from the second cigarette butt (referred to as Item 5B-1).
The six samples in issue were initially analysed in 2007 using a 10 marker profiling kit known as ‘Profiler Plus’ (‘P+’). Two of the samples were re-analysed in 2013, using a much more sensitive and discriminating 21 marker profiling kit called ‘PowerPlex 21’ (‘PP21’).
As mentioned earlier, the likelihood ratio for each of the items has been calculated using the STRmix program, a statistical software package developed by scientists in Australia and New Zealand. It is used in New Zealand and in all States and Territories in Australia except Tasmania and the ACT. It was introduced into Victoria in March 2013.
The six items were analysed by Victoria Police Forensic Science Service (VPFSS). The results of the analysis were as follows:
(a) Item 1-1 (trace from the outside surface of the blindfold fabric)
Item 1-1 was analysed using P+ (10 markers). Analysis showed a partial mixed DNA profile from three contributors. The complainant is an assumed contributor. Using STRmix, it is estimated to be 23 million times more likely that the DNA profile obtained from Item 1-1 would occur if the DNA originated from the accused, the complainant and one unknown person than if it originated from the complainant and two unknown people chosen at random from the Australian Caucasian population.
(b) Item 1-2 (trace from the inside surface of the blindfold fabric)
Item 1-2 was analysed using P+. Analysis showed a mixed DNA profile from three contributors. The complainant is an assumed contributor. Using STRmix, it is estimated to be 9.7 million times more likely that the DNA profile obtained from Item 1-2 would occur if the DNA originated from the accused, the complainant and one unknown person than if it originated from the complainant and two unknown people chosen at random from the Australian Caucasian population.
(c) Item 1-3 (trace from the ends of the shoelace combined)
Item 1-3 was analysed using PP21 (21 markers). The analysis showed a mixed DNA profile from three contributors. The complainant is an assumed contributor. Using STRmix, it is estimated to be 2.7 sextillion[9] times more likely that the DNA profile obtained from Item 1-3 would occur if the DNA originated from the accused, the complainant and one unknown person than if it originated from the complainant and two unknown people chosen at random from the Australian Caucasian population. This is reported using the default likelihood ratio for PP21 analyses of 100 billion.
[9]A sextillion in the UK is the sixth power of a million (10 to the power of 36) and in the US is the seventh power of a thousand (10 to the power of 21).
(d) Item 4-1 (trace from the entire area of the seven cable ties combined)
Item 4-1 was analysed using PP21. The analysis showed a partial mixed DNA profile from two contributors. Using STRmix, it is estimated to be 35 million times more likely that the DNA profile obtained from Item 4-1 would occur if the DNA originated from the accused and an unknown person than if it originated from two unknown persons chosen at random from the Australian Caucasian population.
(e) Item 5A-1 (portion of filter of one cigarette butt)
Item 5A-1 was analysed using P+. Analysis showed a single source profile. Using STRmix, it is estimated to be 29 billion times more likely that the DNA profile would occur if the DNA originated from the accused than if it originated from an unknown person chosen at random from the Australian Caucasian population.
(f) Item 5B-1 (portion of filter of the other cigarette butt)
Item 5B-1 was also analysed using P+. Analysis showed a single source profile. It is estimated to be 29 billion times more likely that the DNA profile would occur if the DNA originated from the accused than if it originated from an unknown person chosen at random from the Australian Caucasian population.
Only the cigarette butts (Items 5A-1 and 5B-1) produced a single source profile. There were multiple contributors to the DNA profiles on the blindfold and the cable ties, making analysis of these DNA profiles and their comparison with the reference samples of the accused more complex. Item 1-1 and Item 4-1 produced only partial (or incomplete) profiles, again making analysis more complex.
In addition, the DNA in Item 4-1 is ‘low-template’ DNA (‘LTDNA’), in that the amount analysed is significantly less than the optimal amount of 0.5 nanograms. Some of the contributors to the DNA on the Items with multiple contributors also individually provided very low amounts of DNA and all of the profiles other than for Items 5A-1 and 5B-1 contain low level or sub-optimal peaks.
The new methodology[10]
[10]What follows in paras 22–31 is the trial judge’s description of how STRmix works, based on the evidence which she heard on the voir dire. The footnotes are her Honour’s. The applicant did not challenge any aspect of this description.
If DNA on a crime scene sample is of low quality or quantity, then the profile may be ambiguous. LTDNA profiles are more likely to exhibit ‘stochastic’, or random, effects which complicate their interpretation. These effects may include ‘drop in’, which are peaks in the profile that are not true alleles.[11] Conversely, alleles may go missing or ‘drop out’ of the profile, producing partial profiles only.
[11]An allele is an alternative form of a gene (one member of a pair) that is located at a specific position on a specific chromosome.
STRmix purports to account for many of these random effects in a DNA profile. It is based on continuous modelling of peak heights, and therefore applies no thresholds below which peaks will not be interpreted.[12] Described as ‘fully-continuous’, it purports to make use of all the information in the profile, including peaks appearing below what other systems used for the evaluation of DNA profiles would designate as a ‘stochastic threshold’.
[12]Other than a very low ‘detection threshold’ in order to eliminate from the analysis any background ‘noise’.
Importantly, STRmix also purports to take into account peaks that are not detected as present in the DNA profile, by positing the possibility that alleles have ‘dropped-out’ of the profile and giving them a value. How this is done, and whether it is legitimate, is a source of strong disagreement between the parties in this case.
As a statistical program, STRmix uses the Markov Chain Monte Carlo (‘MCMC’) process. In seeking to explain the observed profile, STRmix randomly chooses genotype sets for all the contributors, as well as a template and a degradation amount for each contributor and locus amplification efficiencies, gradually building up a picture of the components in the observed profile. Through this iterative process, STRmix identifies the genotype combinations that best explain the observed DNA profile, and it weights the genotype combinations according to how well they explain that profile.
At the end of the process, STRmix will therefore list all possible explanations for the observed profile, including genotypes that have drop-in and drop-out. It will also include multiple drop-out scenarios, as well as scenarios that do not require any drop-out or drop-in. It will produce a comprehensive list of all possible explanations for the profile. These possible explanations are then ranked depending on how well they explain the observed profile: the higher the ranking (or weight), the better the explanation they give.
A reference sample (for example, that of the accused) can then be compared to the list of weighted genotype combinations. If the genotypes in the reference sample are among the highest weighted combinations, a large likelihood ratio will be produced; if the genotypes in the reference sample are not among the listed combinations, the reference person will be excluded as a contributor.
However, a person of interest will not be excluded as a contributor even when his or her alleles are not detected at a marker in the observed profile if STRmix has posited the possibility of drop-out at that marker. STRmix may posit the presence of a dropped or ‘Q’ allele, and give a weighting to the genotype set that includes the Q allele. Because the Q allele could be any allele at the relevant marker, the person of interest will not be excluded. However, the weighting of the genotype set that includes the Q allele will be low, because STRmix assigns a negative number to account for drop-out, and the likelihood ratio decreases as a result.
STRmix determines the possibility or probability of drop-out by reference to peak height variability in the observed profile. It does not calculate drop-out empirically based on observed numbers of drop-outs. Instead, STRmix contains an internal model — known as ‘Model Maker’ — that produces a ‘variance value’ based on calibration data entered by the laboratory. Peak height variability in the observed profile is analysed having regard to the peak height variability experienced in the laboratory generally. The variance value is fed into the STRmix calculation to determine the probability of drop-out in the observed profile.[13]
[13]Dr Taylor gave evidence that there are a number of models that describe different DNA profile behaviours sitting within STRmix. There is a model for template amount and how that is related to peak heights; there is a model that describes degradation across the profile; and there is a model that describes the general amplification efficiency of each locus. There are some DNA profile behaviours that have not been modelled because they make up a minor part of the profile, for example, over-stutter. Some artefacts have not been modelled, for example, pull-up or dye blobs in the generation of electropherograms. They have to be removed manually by a typer. One of the essential components of using STRmix is that the typer has to look at the evidence profile first and decide whether or not it is suitable for analysis in STRmix.
The probabilistic and fully continuous methodology used by STRmix can be contrasted with a classical binary methodology for DNA analysis. Binary methods are rule and threshold based, and a binary determination of the evidence of a match versus a non-match produces probabilities of only one and zero. Using the probabilistic model, the likelihood of the evidence of a match versus a non-match can have any value between zero and one, depending on the weighting. The evidence is evaluated on a continuous scale.
Previously, VPFSS used the ‘SPURS’ methodology to produce likelihood ratios. SPURS is a binary (match or non-match) system. It considers possible genotype combinations from alleles detected as present (based on the application of rules and thresholds) and weights each possible genotype combination equally, whereas STRmix generates numbers of possible genotype combinations and assigns different weightings to them, including where the drop-out of alleles is posited in one or more of the combinations.
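The contrast between equal and unequal weighting of genotype combinations can be sketched in a few lines of Python. This is a simplified, single-locus illustration under stated assumptions and is not a description of the internal workings of SPURS or STRmix. The genotype labels, weights and population frequencies are hypothetical, and the function `likelihood_ratio` is introduced only for this example. It assumes a single contributor, and computes the ratio of the weight given to the reference genotype to the weighted average of the population frequencies of all listed genotypes.

```python
# Illustrative sketch only: hypothetical numbers, not STRmix or SPURS output.

def likelihood_ratio(weights, reference, genotype_freq):
    """Return a likelihood ratio for a single-contributor, single-locus case.

    weights       -- dict mapping each candidate genotype to its weight
    reference     -- genotype of the person of interest
    genotype_freq -- dict mapping each genotype to its population frequency
    """
    # Numerator: weight given to the reference genotype (0 if it is not listed).
    numerator = weights.get(reference, 0.0)
    # Denominator: weighted average frequency of a random person's genotype.
    denominator = sum(w * genotype_freq[g] for g, w in weights.items())
    return numerator / denominator


# Hypothetical population frequencies for three candidate genotypes.
FREQ = {"G1": 0.02, "G2": 0.05, "G3": 0.10}

# Binary-style approach: every genotype that passes the rules is weighted equally.
equal_weights = {"G1": 1 / 3, "G2": 1 / 3, "G3": 1 / 3}

# Continuous-style approach: weights reflect how well each genotype explains the profile.
graded_weights = {"G1": 0.90, "G2": 0.09, "G3": 0.01}

binary_lr = likelihood_ratio(equal_weights, "G1", FREQ)       # roughly 5.9
continuous_lr = likelihood_ratio(graded_weights, "G1", FREQ)  # roughly 38.3
excluded_lr = likelihood_ratio(graded_weights, "G9", FREQ)    # 0.0: not a listed genotype

print(binary_lr, continuous_lr, excluded_lr)
```

On these hypothetical figures, the graded weighting yields a larger ratio when the reference genotype is the best-supported explanation, and a ratio of zero (an exclusion) when the reference genotype is not among the listed combinations.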
The judge commented:
An issue with using a probabilistic system like STRmix is that the results of no two analyses will be the same. Because the MCMC process is a random process, it calculates things slightly differently each time and it does not produce a ‘true’ likelihood ratio. This is relatively new to forensic science. Previously, the calculation of a likelihood ratio would produce a single value, and if the analysis was repeated, the same value would result. With STRmix, it is a question of how much variability is to be expected and whether the actual variability is within expectations so as to enable the information that has been produced to be used.[14]
[14]Reasons [29] (citation omitted).
We return to this issue below.[15]
[15]See [119]–[120] below.
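The run-to-run variability described by the judge is a general feature of Markov Chain Monte Carlo methods and can be demonstrated with a toy sampler. The sketch below is a minimal Metropolis sampler over three hypothetical genotype sets. It is not the STRmix algorithm, and the names `TARGET` and `estimate_weights` and all of the numbers are assumptions made for illustration. Each run estimates the weights from the proportion of steps spent at each genotype set, so different random seeds give slightly different estimates of the same underlying quantity.

```python
import random

# Hypothetical scores for how well each genotype set explains a profile.
TARGET = {"G1": 0.90, "G2": 0.09, "G3": 0.01}


def estimate_weights(seed, steps=20000):
    """Estimate genotype weights with a toy Metropolis sampler."""
    rng = random.Random(seed)           # a seeded generator makes each run repeatable
    states = list(TARGET)
    current = states[0]
    counts = dict.fromkeys(states, 0)
    for _ in range(steps):
        proposal = rng.choice(states)   # symmetric proposal: any state, equally likely
        # Accept the move with probability min(1, target ratio).
        if rng.random() < min(1.0, TARGET[proposal] / TARGET[current]):
            current = proposal
        counts[current] += 1
    # The fraction of steps spent in each state approximates its weight.
    return {s: counts[s] / steps for s in states}


# Five runs that differ only in their random seed.
runs = [estimate_weights(seed) for seed in range(5)]
for seed, run in enumerate(runs):
    print(seed, run)
```

In this toy setting the estimates for G1 cluster around 0.90 without being identical from run to run, which is the kind of variability that must be shown to fall within expected limits before the output is used.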
The expert witnesses
The evidence of the likelihood ratios, as well as the underlying evidence comprising the profiles and case-notes, is proposed to be given by Ms Deborah Scott, who is a Forensic Officer and Senior Case Manager and Unit Leader in the Biological Examination Branch of the VPFSS. Ms Scott gave extensive oral evidence at the preliminary hearing, principally in response to questions in cross-examination.
In her first statement, Ms Scott sets out the likelihood ratios generated by STRmix for a large number of the crime scene samples, including the Items. In the second statement, Ms Scott comments on the reliability of interpreting DNA profiles when less than optimal amounts of DNA have been amplified.
As the judge noted, Ms Scott is an experienced forensic scientist. She has practised as a forensic scientist since May 2000, both in New South Wales and in Victoria. She has a Bachelor of Science degree with honours and a Master of Science in Forensic Science from the University of Strathclyde. She has successfully completed training requirements and is authorised by VPFSS in the interpretation of genetic typing of biological material and the statistical evaluation of DNA profiles.
Her Honour said:
Given Ms Scott’s qualifications and experience in the interpretation of genetic typing of biological material and the statistical interpretation of DNA profiles, I am satisfied that she has specialised knowledge about these matters, that is, knowledge that is outside that of persons who do not have her training, study and experience. She can explain what was done to produce the crime scene and reference sample DNA profiles and how it was done. She can interpret these profiles and evaluate them statistically, at least to the extent that she uses established statistical tools.
…
What appears to be in dispute is whether Ms Scott’s specialised knowledge about the statistical evaluation of DNA profiles enables her to give evidence about the likelihood ratios generated by the STRmix methodology specifically.[16]
[16]Reasons [35], [37].
Ms Scott has received training in the use of STRmix and was able to describe its methodology in broad terms, including how it generates and weights possible genotype combinations, and produces likelihood ratios. Importantly, she gave evidence of the validation studies carried out by VPFSS on the use of STRmix with both P+ and PP21 profiling kits.[17]
[17]See further [117] below.
At the preliminary hearing, the Crown adduced evidence about STRmix and its application in forensic casework from two further witnesses: Dr Duncan Taylor, who developed STRmix in conjunction with colleagues in New Zealand; and Ms Lisa Federle, a Senior Case Manager within the Biological Examination Branch of VPFSS, who carried out the validation studies for STRmix within the VPFSS. (We refer to these validation studies below).[18]
[18]See [117]–[118] below.
The judge noted in her reasons that Ms Federle
considered herself to have sufficient expertise to address the validity and operation of the STRmix program and the reliability of the likelihood ratios. She said she was not necessarily an expert in the mathematics behind STRmix, but she knew about its application to forensic case work and the validation work conducted on its use … by VPFSS. She was involved in the validation studies carried out by VPFSS and had received training that authorised her to train other scientists to use STRmix. She said she was therefore able to give evidence about the mechanics of the program, why it works and why it is sufficient for use in forensic case-work.[19]
[19]Reasons [39].
Her Honour continued:
However, STRmix involves the application of ‘black box’ technology, and its software is not open source. Neither Ms Scott nor Ms Federle could give evidence about the mathematical and statistical models underpinning STRmix, or the testing or validation of the software.
Evidence of this kind could, however, be given by Dr Taylor.[20]
[20]Ibid [40]–[41].
At the preliminary hearing, Dr Taylor gave evidence about the development and operation of STRmix. He has qualifications and experience in the fields of biology and biological statistics. He has a Diploma of Biostatistics from Biostatistics Collaboration of Australia, lectures in DNA statistics in the faculty of Biological Sciences at Flinders University and is the current chair of the Australasian DNA Statistic Advisory Group, of which he has been a member since 2009. He is one of the authors, along with Jo-Anne Bright and John Buckleton, of a paper titled ‘The Interpretation of Single Source and Mixed DNA Profiles’,[21] which describes the STRmix methodology and sets out the mathematical and biological models that underpin it.[22]
[21]Duncan Taylor, Jo-Anne Bright and John Buckleton, ‘The Interpretation of Single Source and Mixed DNA Profiles’ (2013) 7 Forensic Science International: Genetics 516.
[22]Her Honour referred to this paper in her reasons as ‘the STRmix research paper’.
B. ADMISSIBILITY UNDER S 79(1)
As noted earlier, the first ground of the applicant’s challenge to the opinion evidence was that it did not fall within the exception established by s 79(1). That subsection provides as follows:
If a person has specialised knowledge based on the person’s training, study or experience, the opinion rule does not apply to evidence of an opinion of that person that is wholly or substantially based on that knowledge.
As the judge explained, the principal complaint made by the defence was not that the witnesses were unqualified to give evidence about the statistical evaluation of DNA profiles and likelihood ratios. Instead, the objection was to the probabilistic methodology itself. As summarised in her Honour’s reasons, the defence contention was that
STRmix has not been shown to be a reliable tool for the statistical evaluation of DNA profiles. The defence argues that STRmix has not been properly validated for the use to which it has been put by VPFSS and it is not widely accepted by the forensic science community.
The defence witness, Ms Jane Taupin, gave evidence that there is no consensus in the literature that STRmix works or any description of how it works. While probabilistic statistical methodologies have been introduced in other jurisdictions, these other systems have not been evaluated. Neither the United States ‘Scientific Working Group on DNA Analysis Methods’ (‘SWGDAM’) nor the International Society for Forensic Genetics DNA Commission (‘ISFG’) has published guidelines for the use of a fully continuous probabilistic methodology. In particular, there are no recommendations as to how to calculate the probability of allele drop-out or more generally regarding the use of models based on peak height variation, like STRmix. Furthermore, the way in which STRmix determines the probability of drop-out is embedded in an internal system which is not open to evaluation. In Ms Taupin’s opinion, the inherent unreliability of peak heights in low level DNA means that STRmix should only be used where the DNA is not low-template DNA and there is less variance in peak heights. STRmix might be suitable for good quality DNA profiles, but it should not be used on poor quality profiles, particularly without recommendations for its use from either the ISFG or SWGDAM.
For the purpose of s 79 admissibility, the defence contends that the prosecution has not established that STRmix is a reliable body of knowledge in respect of which evidence based on ‘specialised knowledge’ can be given. This is because:
(a) The particular fully-continuous probabilistic methodology used or applied by STRmix is a discrete and new development in the area of DNA science; it involves a new or novel ‘area’ which does not constitute (at least not yet) a reliable body of knowledge; and
(b) Even if the Court came to the view that the fully continuous probabilistic methodology used and applied by STRmix was appropriate for use with optimum amounts of DNA, there is no reliable body of knowledge for its application in relation to complex mixtures and/or sub-optimal amounts of DNA.
…
The defence argument … is based on what counsel described as the ‘probabilistic methodology used by STRmix’ being a new and discrete field of knowledge that remains untested and lacks acceptance in the forensic science community. According to the defence, the methodology used by STRmix is not ‘a body of knowledge or experience which is sufficiently organised or recognised to be accepted as a reliable body of knowledge or experience’.[23]
[23]Reasons [42]–[44], [47] (emphasis added). As her Honour noted, the concluding quotation came from Velevski v The Queen (2002) 76 ALJR 402 (‘Velevski’): see [45] below.
In Honeysett v The Queen,[24] the High Court pointed out that s 79(1) states two conditions of admissibility:
[F]irst, the witness must have ‘specialised knowledge based on the person's training, study or experience’ and, second, the opinion must be ‘wholly or substantially based on that knowledge’. The first condition directs attention to the existence of an area of ‘specialised knowledge’. ‘Specialised knowledge’ is to be distinguished from matters of ‘common knowledge’. Specialised knowledge is knowledge which is outside that of persons who have not by training, study or experience acquired an understanding of the subject matter. It may be of matters that are not of a scientific or technical kind and a person without any formal qualifications may acquire specialised knowledge by experience. However, the person's training, study or experience must result in the acquisition of knowledge. The Macquarie Dictionary defines ‘knowledge’ as ‘acquaintance with facts, truths, or principles, as from study or investigation’ (emphasis added) and it is in this sense that it is used in s 79(1). The concept is captured in Blackmun J's formulation in Daubert v Merrell Dow Pharmaceuticals Inc: ‘the word ‘knowledge’ connotes more than subjective belief or unsupported speculation … [It] applies to any body of known facts or to any body of ideas inferred from such facts or accepted as truths on good grounds’.[25]
[24](2014) 88 ALJR 786 (‘Honeysett’).
[25]Ibid 790–1 [23] (emphasis in original) (citations omitted).
The applicant’s argument, before the judge and again on this application, was that a body of knowledge could not constitute ‘specialised knowledge’ for the purposes of s 79(1) unless it was shown to be ‘a reliable body of knowledge’, generally accepted as such in the forensic science community. The submission was founded on the following statement by Gaudron J in Velevski:[26]
The concept of ‘specialised knowledge’ imports knowledge of matters which are outside the knowledge or experience of ordinary persons and which ‘is sufficiently organised or recognised to be accepted as a reliable body of knowledge or experience’.
[26](2002) 76 ALJR 402, 416 [82] (citation omitted).
As can be seen, the phrase ‘a reliable body of knowledge’ as used here was itself part of a quotation. Gaudron J was here quoting from the well-known judgment of King CJ in R v Bonython,[27] a case decided under the common law concerning the admissibility of opinion evidence from a police handwriting expert. The issue in that case was whether defence counsel should have been permitted, on the voir dire, to investigate by cross-examination the method by which the expert had reached his opinion. It was contended that the trial judge had been required ‘to be satisfied as to the soundness of the methodology adopted before allowing the evidence to be given’.[28]
[27](1984) 38 SASR 45 (‘Bonython’).
[28]Ibid 46.
The South Australian Full Court rejected that contention. King CJ (with whom Matheson and Bollen JJ agreed) said:
Before admitting the opinion of a witness into evidence as expert testimony, the judge must consider and decide two questions. The first is whether the subject matter of the opinion falls within the class of subjects upon which expert testimony is permissible. This first question may be divided into two parts:
(a) whether the subject matter of the opinion is such that a person without instruction or experience in the area of knowledge or human experience would be able to form a sound judgment on the matter without the assistance of witnesses possessing special knowledge or experience in the area[;] and
(b) whether the subject matter of the opinion forms part of a body of knowledge or experience which is sufficiently organized or recognized to be accepted as a reliable body of knowledge or experience, a special acquaintance with which by the witness would render his opinion of assistance to the court.
The second question is whether the witness has acquired by study or experience sufficient knowledge of the subject to render his opinion of value in resolving the issues before the court.
An investigation of the methods used by the witness in arriving at his opinion may be pertinent, in certain circumstances, to the answers to both the above questions. If the witness has made use of new or unfamiliar techniques or technology, the court may require to be satisfied that such techniques or technology have a sufficient scientific basis to render results arrived at by that means part of a field of knowledge which is a proper subject of expert evidence. Examples of cases in which that question arose are The Queen v Gilmore, The Queen v McHardie and Danielson and United States v Williams. An investigation of the methods adopted by a witness may be relevant to an assessment of his qualifications as a witness if such an investigation might reveal that the witness has ‘posing as an expert made assertions that are contrary to proved scientific facts or to the known phenomena of nature, thus exposing his ignorance of the learning he professed’ … or that the witness has adopted methods which are so unscientific as to expose that ignorance.
…
Generally speaking, once the qualifications are established, the methodology will be relevant to the weight of the evidence and not to the competence of the witness to express an opinion. The suitability and adequacy of the methods used may well be themselves a matter of expert opinion.[29]
[29]Ibid 46–7 (emphasis added) (citations omitted).
As Gaudron J noted in Velevski, she had previously cited Bonython with approval in HG v The Queen[30] (decided under the Act) and in Osland v The Queen[31] (decided under the common law). In HG, her Honour had said:
There is no reason to think that the expression ‘specialised knowledge’ [in s 79(1)] gives rise to a test which is in any respect narrower or more restrictive than the position at common law.[32]
[30](1999) 197 CLR 414, 432 [58] (‘HG’).
[31](1998) 197 CLR 316, 336 [53] (joint judgment with Gummow J).
[32](1999) 197 CLR 414, 432 [58].
Several points should be made about the Bonython test. First, the question posed by King CJ is whether a body of knowledge ‘is sufficiently organised or recognised’ to be accepted as reliable. The criteria of reliability thus specified are directed at questions of form (‘sufficiently organised’) and accreditation (‘sufficiently recognised’). Satisfaction of either criterion would seemingly be sufficient for the body of knowledge ‘to be accepted as a reliable body of knowledge’. Thus, in HG, Gaudron J held that opinion evidence would be admissible under s 79(1) if it were shown that the relevant ‘body of knowledge’ was ‘sufficiently recognised’ to be accepted as reliable.
Secondly — and of particular relevance to the present application — is what King CJ said in relation to ‘new or unfamiliar techniques or technology’. In such a case, his Honour said, the judge may require to be satisfied:
that such techniques or technology have a sufficient scientific basis to render results arrived at by that means part of a field of knowledge which is a proper subject of expert evidence.[33]
[33]Bonython (1984) 38 SASR 45, 47 (emphasis added).
The applicant’s contention — that the phrase ‘specialised knowledge’ in s 79(1) necessarily imports a criterion of reliability — has strong academic support. For example, Professor Gary Edmond of the University of New South Wales has argued that:
Reference to specialised knowledge in s 79(1) requires evidence that something is known — that is, already known. It does not refer to things that could be known, to things that seem plausible, to something that might be exposed during a trial, or to what a jury might accept. Unreliable knowledge is oxymoronic. Specialised knowledge that is not demonstrably reliable is not knowledge. Similarly, things that are uncertain, speculative or not well supported do not constitute knowledge. With the emergence of wide-ranging, authoritative and unanswered critiques of the forensic sciences, the time is ripe for appellate courts to consider not only the form of the opinion, but also the more fundamental question of whether there is evidence of specialised knowledge. Where forensic science techniques have not been formally evaluated we do not know if they constitute ‘knowledge’.[34]
[34]Gary Edmond, ‘The Admissibility of Forensic Science and Medicine Evidence under the Uniform Evidence Law’ (2014) 38 Criminal Law Journal 136, 143 (emphasis added). See also Gary Edmond and Mehera San Roque, ‘Honeysett v The Queen: Forensic Science, “Specialised Knowledge” and the Uniform Evidence Law’ (2014) 36 Sydney Law Review 323; Prudence Buckland, ‘Honeysett v The Queen (2014) 311 ALR 320: Opinion Evidence and Reliability, A Sticking Point’ (2014) 35 Adelaide Law Review 449. See also Andrew Ligertwood and Gary Edmond, Australian Evidence: A Principled Approach to the Common Law and the Uniform Evidence Acts (LexisNexis Butterworths, 5th ed, 2010) 682 [7.54].
Professor Edmond’s advocacy of this construction of s 79(1) is accompanied by ‘a plea for greater rigour’ in the judicial assessment of expert scientific evidence.[35] He and others contend that ‘too much weak, speculative and unreliable opinion is allowed into criminal proceedings’.[36] Professor Edmond cites recent reports — by the United States National Academy of Sciences, the National Institute of Standards and Technology and the National Institute of Justice respectively — as having identified
serious epistemic and structural problems with many forensic science and medical techniques and practices … The reports from these reviews and inquiries place emphasis on the surprising lack of research (especially validation studies), the lack of disclosure, the failure to report uncertainties and error rates, the lack of standards and weak application of standards, misguided and misleading ways of expressing opinions, inattention to cognitive contamination (that is contextual bias), as well as the need for transparency, disclosure, more detailed reports and institutional separation of forensic analysts from law enforcement.[37]
We return to this important question below.[38]
[35]Gary Edmond, ‘The Admissibility of Forensic Science and Medicine Evidence under the Uniform Evidence Law’ (2014) 38 Criminal Law Journal 136, 137.
[36]Gary Edmond and Mehera San Roque, ‘Honeysett v The Queen: Forensic Science, “Specialised Knowledge” and the Uniform Evidence Law’ (2014) 36 Sydney Law Review 323, 324.
[37]Gary Edmond, ‘The Admissibility of Forensic Science and Medicine Evidence under the Uniform Evidence Law’ (2014) 38 Criminal Law Journal 136, 137.
[38]See [107]–[114] below.
As the applicant acknowledged, the only appellate court which has directly addressed this question of construction concluded that reliability fell outside the scope of s 79(1). In R v Tang,[39] the New South Wales Court of Criminal Appeal was concerned with the admissibility of expert opinion evidence about facial and bodily similarities between individuals, conventionally referred to as ‘facial mapping’ and ‘body mapping’ respectively. (The expert witness had been asked to compare photographs made from a videotape with photographs of the accused.)
[39](2006) 65 NSWLR 681 (‘Tang’).
The Court of Criminal Appeal held that the opinion evidence was inadmissible. It failed to satisfy either of the two ‘limbs’ of s 79(1), because:
· ‘body mapping’ did not constitute ‘specialised knowledge’; and
· neither ‘facial mapping’ nor ‘body mapping’ was a body of knowledge on which the opinions purportedly given could be ‘based’.
As to ‘specialised knowledge’, Spigelman CJ (with whom Simpson and Adams JJ agreed) said:
With respect to the first limb of s 79, I have set out above the evidence of the nature of the specialised knowledge which Dr Sutisno said she had brought to bear in the formulation of the three opinions she expressed. There does appear to be a body of expertise based on facial identification. The detailed knowledge of anatomy which Dr Sutisno unquestionably had, together with her training, research and experience in the course of facial reconstruction supports her evidence of facial characteristics.
Nothing was presented to the Court which indicates, in any way, that Dr Sutisno’s extension from facial to body mapping, with respect to matters of posture, has anything like that level of background and support. Specialist knowledge of posture can of course exist … But the foundation for admissibility must be lain. It was not lain in the present case. The so-called ‘unique identifier’ of posture was an essential element of Dr Sutisno’s evidence of identity in the present case.
The focus of attention must be on the words ‘specialised knowledge’, not on the introduction of an extraneous idea such as ‘reliability’.
…
In the immediate context of ‘specialised knowledge’, picked up by the words ‘that knowledge’ in the second limb of s 79, the word ‘knowledge’ has a different connotation to that which it might have in a different context, for example, ‘common knowledge’. The meaning of ‘knowledge’ in s 79 is, in my opinion, the same as that identified in the reasons of the majority judgment in Daubert v Merrell Dow Pharmaceuticals Inc 509 US 579 (1993) at 590: ‘[T]he word “knowledge” connotes more than subjective belief or unsupported speculation. The term “applies to any body of known facts or to any body of ideas inferred from such facts or accepted as truths on good grounds”’. The quoted definition is from an American dictionary.
I do not mean to suggest that Daubert and its progeny in the United States has anything useful to say about s 79 of the Evidence Act. Rule 702 of the Federal Rules of Evidence (2004), which fell to be interpreted in Daubert, is in quite different terms to s 79. The definition of the word ‘knowledge’ in this cognate context is, however, instructive.
In the case of the appellant the relevant evidence about posture was expressed in terms of ‘upright posture of the upper torso’ or similar words. The only links to any form of ‘training, study or experience’ was the witnesses’ study of anatomy and some experience, entirely unspecified in terms of quality or extent, in comparing photographs for the purpose of comparing ‘posture’. The evidence in this trial did not disclose, and did not permit a finding, that Dr Sutisno’s evidence was based on a study of anatomy. That evidence barely, if at all, rose above a subjective belief and it did not, in my opinion, manifest anything of a ‘specialised’ character. It was not, in my opinion, shown to be ‘specialised knowledge’ within the meaning of s 79.[40]
[40]Ibid 712–13 [135]–[140] (emphasis added) (citations omitted).
In Honeysett, the High Court described the decision in Tang in these terms:[41]
In R v Tang, the Court of Criminal Appeal dealt with a challenge to evidence that a person depicted in still frames taken from a CCTV recording of a robbery was the same as a person depicted in a police photograph. The evidence in that case was given by Dr Sutisno. Her opinion took into account her assessment of the ‘relatively upright posture’ of the person in each of the images. The observation was an essential element of her opinion of identity.
Spigelman CJ (Simpson and Adams JJ concurring) cautioned against introducing an extraneous idea such as ‘reliability’ into the determination of admissibility under s 79(1). Importantly, his Honour laid emphasis on the requirement of knowledge by reference to the statement in Daubert set out earlier in these reasons.[42] The opinion that the individual displayed ‘relatively upright posture’ was not wholly or substantially based on Dr Sutisno’s specialised knowledge of anatomy. His Honour found that it had not been established at the trial that the comparison of physical attributes – ‘body mapping’ – constituted an area of ‘specialised knowledge’ capable of supporting an opinion of identity.
[41](2014) 88 ALJR 786, 791 [26]–[27] (citations omitted).
[42]See [44] above.
Of central significance to the present application is the statement in Tang that
the focus of attention [under s 79(1)] must be on the words ‘specialised knowledge’, not on the introduction of an extraneous idea such as ‘reliability’.[43]
This statement was evidently prompted by the first ground of appeal, which contended that the opinion evidence was inadmissible because
it was not evidence of opinion based on [a] recognised and testable body of reliable scientific or specialised knowledge, training, study or experience …[44]
[43](2006) 65 NSWLR 681, 712 [137].
[44]Ibid 703 [82] (emphasis added).
As a matter of established principle, the construction of s 79(1) adopted in Tang — treating questions of reliability as extraneous — should be followed unless we were persuaded that it was plainly wrong.[45] (As appears from the extract set out above, the High Court in Honeysett referred — with apparent approval — to Spigelman CJ as having ‘cautioned against introducing an extraneous idea such as “reliability” into the determination of admissibility under s 79(1)’.)
[45]DPP v Patrick Stevedores Holdings Pty Ltd [2012] VSCA 300, [111].
The judgments in both Tang and Honeysett concluded that the word ‘knowledge’ in s 79(1) had the meaning attributed to it by the United States Supreme Court in Daubert v Merrell Dow Pharmaceuticals Inc.[46] As will appear, however, the Supreme Court in that case was principally concerned with the judicial assessment (under the Federal Rules of Evidence) of the reliability of scientific opinion evidence.
[46]509 US 579 (1993) (‘Daubert’).
At issue in Daubert was the admissibility of scientific evidence directed at assessing the likelihood that a particular pregnancy drug had caused birth defects. Up to that point, the ‘dominant standard’ for determining the admissibility of novel scientific evidence had been the so-called ‘general acceptance’ test, laid down 70 years earlier by the then Court of Appeals for the District of Columbia in Frye v United States.[47] The Frye test was formulated in these terms:
Just when a scientific principle or discovery crosses the line between the experimental and demonstrable stages is difficult to define. Somewhere in this twilight zone the evidential force of the principle must be recognized, and while courts will go a long way in admitting expert testimony deduced from a well-recognized scientific principle or discovery, the thing from which the deduction is made must be sufficiently established to have gained general acceptance in the particular field in which it belongs.[48]
[47]293 F 1013 (1923).
[48]Ibid 1014 (emphasis added).
In Daubert, the Supreme Court concluded that the Frye test had been superseded by r 702 of the Federal Rules of Evidence, which at that time provided as follows:
If scientific, technical, or other specialized knowledge will assist the trier of fact to understand the evidence or to determine a fact in issue, a witness qualified as an expert by knowledge, skill, experience, training, or education, may testify thereto in the form of an opinion or otherwise.[49]
[49]The rule was amended in 2000 to include a requirement that the testimony be ‘the product of reliable principles and methods’: Andrew Ligertwood and Gary Edmond, Australian Evidence: A Principled Approach to the Common Law and the Uniform Evidence Acts (LexisNexis Butterworths, 5th ed, 2010) 624 [7.52].
Rejection of the Frye test did not mean, however, that there were no limits on the admissibility of ‘purportedly scientific evidence’.[50] On the contrary, the Court said
the trial judge must ensure that any and all scientific testimony or evidence admitted is not only relevant, but reliable.
The primary locus of this obligation is Rule 702, which clearly contemplates some degree of regulation of the subjects and theories about which an expert may testify.
…
The subject of an expert’s testimony must be ‘scientific … knowledge’. The adjective ‘scientific’ implies a grounding in the methods and procedures of science. Similarly, the word ‘knowledge’ connotes more than subjective belief or unsupported speculation. The term ‘applies to any body of known facts or to any body of ideas inferred from such facts or accepted as truths on good grounds’ … Of course, it would be unreasonable to conclude that the subject of scientific testimony must be ‘known’ to a certainty; arguably, there are no certainties in science.
But, in order to qualify as ‘scientific knowledge’, an inference or assertion must be derived by the scientific method. Proposed testimony must be supported by appropriate validation — ie, ‘good grounds’, based on what is known. In short, the requirement that an expert’s testimony pertain to ‘scientific knowledge’ establishes a standard of evidentiary reliability.[51]
[50]509 US 579, 589 (1993).
[51]Ibid 589–90 (emphasis added) (citations omitted).
The judge’s consideration of admissibility would thus entail:
a preliminary assessment of whether the reasoning or methodology underlying the testimony is scientifically valid and of whether that reasoning or methodology properly can be applied to the facts in issue.[52]
While eschewing a ‘definitive checklist or test’ to be applied in the making of this assessment, the Court proceeded to set out ‘some general observations’.[53]
[52]Ibid 592–3.
[53]Ibid 593.
The ‘overarching subject’ of the admissibility inquiry should be:
the scientific validity — and thus the evidentiary relevance and reliability — of the principles that underlie a proposed submission. The focus, of course, must be solely on principles and methodology, not on the conclusions that they generate.[54]
[54]Ibid 594–5.
The inquiry must be a ‘flexible one’,[55] the Court said, to which the following considerations will ordinarily be relevant:
[55]Ibid 594.
(p) whether the theory or technique can be — and has been — tested;
(q) whether the theory or technique has been subjected to peer review and publication; and
(r) the known or potential rate of error, and the existence and maintenance of standards controlling the technique’s operation.[56]
[56]Ibid 593–4.
And there might still be a role for a test of general acceptance:
A ‘reliability assessment does not require, although it does permit, explicit identification of a relevant scientific community and an express determination of a particular degree of acceptance within that community’ … Widespread acceptance can be an important factor in ruling particular evidence admissible, and ‘a known technique which has been able to attract only minimal support within the community’ … may properly be viewed with scepticism.[57]
(As Mullighan J pointed out subsequently in R v Karger,[58] the Frye test continues to be applied in the United States outside the federal system.)
[57]Ibid 594 (citations omitted).
[58](2001) 83 SASR 1, 43–4 [179].
Although the High Court in Honeysett[59] made no reference to the discussion of reliability in Daubert, the Court did draw attention to the decision of the New South Wales Court of Criminal Appeal in Morgan v The Queen.[60] In that case, as the High Court noted, the Court of Criminal Appeal had expressed concern about ‘the lack of research into the validity, reliability and error rate of the process [of body mapping]’.[61] Importantly for present purposes, however, the Court in Morgan held that the appellant’s challenge to admissibility did not require it ‘to examine the science of body mapping’.[62] Rather, the question was whether the expert witness
had specialised knowledge, beyond the reach of lay people, which he brought to bear in arriving at his opinion.[63]
[59](2014) 88 ALJR 786.
[60](2011) 215 A Crim R 33 (‘Morgan’).
[61]Ibid 59 [138].
[62]Ibid 60 [139].
[63]Ibid 59–60 [138]–[139].
Reference should also be made to the earlier decision of the New South Wales Supreme Court in R v McIntyre.[64] In that case, Bell J (sitting as a trial judge) was concerned with the admissibility of opinion evidence as to the results of DNA testing of various crime scene items. The DNA analysis was conducted by means of the Profiler Plus system.[65] Having heard evidence and argument on the voir dire, her Honour concluded that the evidence was admissible under s 79(1).
[64][2001] NSWSC 311.
[65]See [17] above.
As her Honour noted, the basis of the defence objection to the admission of the evidence
went principally to the reliability of results obtained by means of the Profiler Plus system. It was submitted that material identifying the primer sequences supplied with the Profiler Plus kit is not publicly available since the manufacturers consider it to be commercially sensitive information. It flowed from this that the Profiler Plus system is not open to critical discussion among the scientific community.[66]
[66][2001] NSWSC 311, [3].
Her Honour ruled that she would not permit the voir dire to extend to a challenge
based upon a contention that the Profiler Plus system was not reliable in the sense that it had not received acceptance within the scientific community by reason of the non-publication of commercially sensitive material relating to the primers marketed as part of the kit.[67]
In her Honour’s view
the question of whether a field is one of ‘specialised knowledge’ for the purpose of s 79 of the Act does not require proof of the matters with which the Court was concerned in Daubert … which include proof of capacity for testing, actual testing, peer review, publication and the like.[68]
[67]Ibid [7].
[68]Ibid [14].
Consideration
In our respectful opinion, the conclusion reached by the New South Wales courts — first in McIntyre and then in Tang[69] — is correct. That is, the language of s 79(1) leaves no room for reading in a test of evidentiary reliability as a condition of admissibility.
[69](2006) 65 NSWLR 681.
As Gleeson CJ said in HG,[70] ‘it is the language of s 79(1) which has to be applied’.[71] The High Court has repeatedly emphasised that statutory interpretation begins, and ends, with the words which Parliament has used.[72] For it is through the statutory text that the legislature expresses, and communicates, its intention. Of course, the interpretation of a particular provision requires consideration of the legislative context and — where relevant — the legislative history. But if the words are clear and unambiguous, the provision must be given its ordinary and grammatical meaning.
[70](1999) 197 CLR 414.
[71]Ibid 427 [40] n 37.
[72]Thiess v Collector of Customs (2014) 250 CLR 664, 671 [22]. See also Baini v The Queen (2012) 246 CLR 469, 476 [14]; Legal Services Board v Gillespie-Jones (2013) 249 CLR 493, 509 [49], 511 [59].
The first condition of admissibility under s 79(1) is that the person who is to give the opinion evidence ‘has specialised knowledge’. As is apparent from what was said in Honeysett,[73] this phrase presents neither conceptual nor linguistic difficulty. Applying the Daubert formulation as approved in Honeysett, the focus of the inquiry will be on the witness’s ‘acquaintance with facts, truths or principles, as from study or investigation’. At the same time, under that formulation ‘knowledge’ is not confined to a body of facts but encompasses ‘ideas inferred from such facts … on good grounds’. On this view, the witness’s ‘specialised knowledge’ will encompass both the facts of which he/she has knowledge and the ‘ideas’ — inferences, hypotheses and theories — based on those facts.
[73]See [44] above.
In order for the knowledge to be ‘specialised’, the witness must be shown to have knowledge of the relevant subject matter
which is outside that of persons who have not by training, study or experience acquired an understanding of the subject matter.
The witness’s possession of knowledge having that specialist character is a question of fact.[74] In assessing the admissibility of the evidence, the judge must, of necessity, ascertain and define with some precision the scope, and the limits, of the witness’s ‘specialised knowledge’.[75] This must be so, given that the second condition of admissibility requires the opinion to be ‘wholly or substantially based’ on that knowledge.[76]
[74]Hamod v Suncorp Metway Insurance Limited [2006] NSWCA 243, [39].
[75]See Ocean Marine Mutual Insurance Association (Europe) OV v Jetopay Pty Ltd (2000) 120 FCR 146, 151 [22].
[76]Dasreef Pty Ltd v Hawchar (2011) 243 CLR 588, 604 [36]–[37].
What is to be made of the Daubert requirement that — in order to qualify as ‘knowledge’— an inference must be drawn ‘on good grounds’? This notion was further elaborated in Daubert, in a passage not quoted in either Tang or Honeysett, as follows:
[I]n order to qualify as ‘scientific knowledge’, an inference or assertion must be derived by the scientific method. Proposed testimony must be supported by appropriate validation ie ‘good grounds’ based on what is known. In short, the requirement that an expert’s testimony pertain to ‘scientific knowledge’ establishes a standard of evidentiary reliability.[77]
[77]Daubert 509 US 579, 590 (1993) (emphasis added).
As this passage makes clear, the Supreme Court was here construing the phrase ‘scientific knowledge’ as it appeared in r 702 of the Federal Rules of Evidence. The Court viewed ‘good grounds’ as being synonymous with the ‘appropriate validation’ required by ‘scientific method’. Section 79(1), by contrast, speaks of ‘knowledge’, not ‘scientific knowledge’. Unlike r 702, therefore, s 79(1) does not itself establish ‘a standard of evidentiary reliability.’[78]
[78]Cf Gary Edmond and Mehera San Roque, ‘Honeysett v The Queen: Forensic Science, “Specialised Knowledge” and the Uniform Evidence Law’ (2014) 36 Sydney Law Review 323, 332–3; see Amaba Pty Ltd v Booth [2010] NSWCA 344, [57]. It should be noted that the United States Supreme Court itself, in the subsequent decision in Kumho Tire Co v Carmichael, took a different view, holding that what had been said in Daubert about r 702 applied to all ‘scientific’, ‘technical’ and ‘other specialised’ matters within the scope of the rule: Kumho Tire Co v Carmichael 526 US 137, 147 (1999).
In our view, s 79(1) contains its own specification of the requisite foundation of the witness’s ‘knowledge’, namely, that the knowledge must be ‘based on the person’s training, study or experience’.[79] To take an example discussed in argument, a medical specialist with expertise in occupational lung disease may have come up with a new theory about the link between a particular form of lung disease and a particular industrial emission. Notwithstanding its novelty, the theory could properly be viewed as part of the expert’s ‘specialised knowledge’ provided that the theory was demonstrably based on ‘the person’s training, study or experience’. Once that was established, it would be no objection to admissibility that there was dispute in the relevant field about whether the theory was ‘correct’. Questions of reliability would fall for consideration separately, as discussed below.
[79]See Adler v Australian Securities and Investments Commission (2003) 179 FLR 7, 137–8 [629].
It follows, in our view, that a person’s knowledge may qualify as ‘specialised knowledge’ for the purposes of s 79(1) even if the area of knowledge is novel or the inferences drawn from the facts have not been tested, or accepted, by others. The position would have been different if, instead, s 79(1) had provided that an opinion was only admissible if shown to be based on a ‘reliable’ or ‘established’ body of knowledge. No such language was used, however, and the legislative history makes clear that this was a deliberate legislative choice.
In its 1985 Interim Report on Evidence, which preceded the enactment of the Uniform Evidence Acts, the Australian Law Reform Commission said:
[It has been suggested that] the expert must be able to point to a relevant accepted ‘field of expertise’ and the use of accepted theories and techniques. Quite what constitutes such a field remains a matter for speculation. There are major difficulties in implementing such a test … It is proposed, therefore, not to introduce the ‘field of expertise’ test. There will be available the general discretion to exclude evidence when it might be more prejudicial than probative, or tend to mislead or confuse the tribunal of fact. This could be used to exclude evidence that has not sufficiently emerged from the experimental to the demonstrable.[80]
[80]Australian Law Reform Commission, Evidence (Interim), Report No 26 (1985), [743] (emphasis added). This position was maintained in Australian Law Reform Commission, Evidence, Report No 38 (1987): [148]–[150].
This question was revisited in the 2005 joint report on Evidence by the Australian, New South Wales and Victorian Law Reform Commissions.[81] The joint report noted the ongoing debate at common law
as to whether and what extent the law should require the demonstration of a field of expertise or acceptance of a particular discipline … as a condition of admissibility of expert opinion on a matter. The uniform Evidence Acts do not contain any such express requirement …[82]
The report also noted the concerns expressed by the ALRC in 1985
about how the area of specialised knowledge should be identified, and at the possibility that the identified area of specialised knowledge might be tested by general acceptance or similar theories. [The ALRC] rejected identification of the area of specialised knowledge through application of a ‘general acceptance’ test or a ‘reputable body of opinion’ test of reliability because this was too strict, and would cause much useful and reliable evidence to be excluded. It would result in courts lagging behind advances in science and other learning.[83]
[81]Australian Law Reform Commission, Uniform Evidence Law, Report No 102 (2006).
[82]Ibid [9.31].
[83]Ibid [9.38].
The Commissions had sought comment on whether ‘significant problems’ were caused by the admission of expert evidence from novel scientific or technical fields and, if so, whether the Acts should be amended. The report continued:
In DP 69[84] the Commissions set out the responses received. They noted that most stakeholders consulted were reasonably satisfied with the way s 79 has been interpreted and applied. Having reviewed the responses, the Commissions stated the following views:
·that s 79 was not intended to enact, and does not enact, a ‘field of expertise’ test based on ‘general acceptance’ or similar requirements; and
·that the concerns as to probative value of evidence admitted under s 79, its potential to mislead, and the time and cost that have given rise to more stringent rules are best addressed by the discretion under s 135 for a court not to admit evidence in certain cases, and by the discretion under s 136 to limit the use which can be made of evidence by the tribunal of fact.
It was suggested that evaluation of new and developing areas of knowledge will continue to pose a challenge for the courts due to the nature of the exercise, and that adding new criteria to the uniform Evidence Acts would not simplify the task and might introduce new uncertainties.[85]
[84]Australian Law Reform Commission, Review of the Uniform Evidence Acts, Discussion Paper No 69 (2005).
[85]Ibid [9.40]–[9.41].
In conclusion, the joint report said:
The Commissions remain of the view that it is unnecessary to recommend an amendment to import any of the tests, such as the Frye test, that have been considered necessary at common law, or to clarify any aspects of the ‘specialised knowledge’ requirement of s 79.[86]
Unsurprisingly, therefore, when s 79(1) was enacted in Victoria in 2008, it was in exactly the same terms as s 79(1) of the 1995 Acts.
[86]Ibid [9.43].
Our conclusion about s 79(1) is a conclusion about statutory construction. It says nothing about the importance of the rigorous assessment of evidentiary reliability when expert opinion evidence is proposed to be called. As will appear, we view this as a matter of the first importance to the integrity and fairness of the criminal justice system. On our analysis, however, reliability falls for consideration under s 137, not under s 79(1).[87]
[87]See Jeremy Gans and Andrew Palmer, Uniform Evidence (Oxford University Press, 2nd ed, 2014) 147, 151.
The judge’s findings on ‘specialised knowledge’
Her Honour’s findings were as follows:
The knowledge of Ms Scott, Ms Federle and Dr Taylor about the statistical evaluation of DNA profiles is knowledge that involves acquaintance with a range of facts and principles concerning the nature of DNA profiles and their evaluation. Dr Taylor’s knowledge of statistics in general, and biological statistics in particular, is also based on known facts and recognised principles. These are plainly fields of specialised knowledge.
…
On the basis of the evidence at the preliminary hearing, I have concluded that knowledge concerning the statistical evaluation of DNA profiles, including by a fully continuous probabilistic system such as STRmix, is ‘specialised knowledge’ that could support opinion evidence in the form of likelihood ratios generated by STRmix. The DNA evidence proposed to be given by Ms Scott is admissible as opinion evidence under s 79(1) of the Evidence Act having regard to the supporting evidence of Ms Federle and Dr Taylor.[88]
[88]Reasons [46], [79].
As noted earlier, there was no challenge to these findings. This is not surprising. In our respectful opinion, they were plainly correct.
C. CONSIDERING RELIABILITY UNDER S 137
In Dupas v The Queen,[89] this Court concluded that, when considering the probative value of evidence for the purposes of the exclusionary rule in s 137 of the Act, the trial judge retained ‘the common law function of assessing reliability and weight’.[90] In relation to the common law discretion to exclude evidence, the Court said:
However the need was expressed and whatever the category of the evidence to which it applied, where the unfair prejudice was asserted to be the danger that the jury would attach undue weight to the impugned evidence, an evaluation of the weight of the probative evidence necessarily involved an assessment of the quality (and any inherent frailty) of that evidence. That is, the trial judge was required to form an opinion about the weight that a jury could reasonably assign to the evidence. Part of that task was to evaluate the quality, reliability and weight of the evidence. These terms have generally been treated as interchangeable in the present context.
Once an evaluation was made of the weight the jury might reasonably attach to the evidence, some assessment was then required of the nature and degree of the risk that the evidence might be misused for an improper purpose, or given undue weight. The likelihood of the risk eventuating, and its nature, would be balanced by the judge’s view of the extent to which directions would ameliorate that risk. Once those matters had been assessed by the trial judge, the balancing exercise could be undertaken to determine whether the risk of prejudice was outweighed by the probative value of the evidence. Thus, where the probative value was significant and there was a low risk of the jury giving it greater weight than was warranted, or of it being used in an illegitimate way, the trial judge would not exclude the evidence. Conversely, if because of its unreliability the evidence had low probative value, yet there was a real risk that the jury would attach more weight to it than it deserved, and that risk could not be overcome by strong directions from the trial judge, the evidence would be excluded …[91]
[89](2012) 40 VR 182 (‘Dupas’).
[90]Ibid 224 [164].
[91]Ibid 219 [141]–[142] (emphasis added).
The Court in Dupas noted that expert evidence had
often been excluded in the exercise of the discretion where the trial judge was dissatisfied with its reliability and probative value.[92]
The Court continued:
Though modern attitudes towards expert evidence may be less exclusionary than in the past, it remains important — as Dawson J stated in Murphy v The Queen — to recognise the dangers of wrongly admitting it:
The admission of such evidence carries with it the implication that the jury are not equipped to decide the relevant issue without the aid of expert opinion and thus, if it is wrongly admitted, it is likely to divert them from their proper task which is to decide the matter for themselves using their own common sense. And even though most juries are not prone to pay undue deference to expert opinion, there is at least a danger that the manner of its presentation may, if it is wrongly admitted, give to it an authority which is not warranted.[93]
[92]Ibid 214 [125].
[93]Ibid 214 [126].
As we have seen, it was the view of the ALRC originally, and of the three Commissions in their joint report in 2005, that the reliability of expert evidence could adequately be addressed under s 137 (or s 135). As noted earlier, the applicant’s alternative argument was that the judge should have excluded the DNA opinion evidence under s 137. Applying Dupas, her Honour addressed the question of reliability in deciding whether to exclude the evidence under s 137.[94]
[94]See Gary Edmond, David Hamer, Andrew Ligertwood and Mehera San Roque, ‘Christie, Section 137 and Forensic Science Evidence (After Dupas v The Queen and R v XY)’ (2014) 40 Monash Law Review 389, 401, 411.
Before setting out her Honour’s analysis, it is necessary to address the more general question of how a judge is to assess the reliability of scientific evidence.
Assessing evidentiary reliability
As the Supreme Court of Canada said in 2007:
Evidence that is not sufficiently reliable is likely to undermine the fundamental fairness of the criminal process.[95]
The dangers of ‘junk science’ are obvious, as that Court had pointed out in an earlier decision:
Dressed up in scientific language which the jury does not easily understand and submitted through a witness of impressive antecedents, this evidence is apt to be accepted by the jury as being virtually infallible and as having more weight than it deserves.[96]
[95]R v Trochym [2007] 1 SCR 239, 260 [27] (‘Trochym’).
[96]R v Mohan [1994] 2 SCR 9, 21.
There have been numerous attempts over the years to develop tests of reliability for expert evidence. An important distinction which emerges is that between the reliability of what is referred to as the ‘underlying science’ and the reliability of the particular methodology or theory on which the expert’s opinion is based. The distinction can be illustrated by reference to the extensive materials, both judicial and scientific, which were placed before the Court regarding the evaluation of LTDNA profiles.[97]
[97]‘LTDNA’ means ‘low-template DNA’: see [21] above.
In 2008, a report commissioned by the UK Forensic Science Regulator (‘the Caddy Report’)[98] found the ‘underlying science’ of LTDNA evidence to be sound, while noting that there was at that stage no ‘legal and scientific consensus’ with respect to the analysis and interpretation of LTDNA profiles. A key question, the report said, was whether the processes involved in LTDNA analysis had been ‘adequately validated’.
[98]Brian Caddy, Adrian Linacre and Graham Taylor, ‘A Review of the Science of Low Template DNA Analysis’ (UK Home Office, 2008) (‘Caddy Report’).
The Caddy Report quoted with approval the following statement from Mr Justice Weir, whose criticisms of LTDNA evidence in the 2007 Omagh Bombing case had attracted widespread attention:
Validation is the process whereby the scientific community acquires the necessary information to:
·assess the ability of a procedure to obtain reliable results;
·determine the conditions under which such results can be obtained;
·define the limitations of the procedure.
The validation process identifies aspects of a procedure that are critical and must be carefully controlled. [99]
[99]R v Hoey [2007] NICC 49, [62].
The Caddy Report continued:
Because science is fundamentally an exoteric process [intelligible to outsiders], it is the norm in empirical science that findings and data are independently replicated prior to widespread acceptance. Lack of refutation is not sufficient of itself, regardless of the source of the original work.
…
To provide validation it is normal practice to begin with samples of known provenance and to submit them to the process and then to see how they comply with the expected outcome.[100]
[100]Caddy Report [3.14]–[3.15].
Following the Caddy Report, a further review of the interpretation of DNA evidence was commissioned by the Forensic Science Regulator in 2012. The aim of the report was said to be ‘to set out the basic principles to interpret DNA profiles’. The first of the proposed principles was in these terms:
Interpretation methodology should ideally be based on validated continuous probabilistic method(s), whether produced using an expert system, software or through manually based methods.[101]
[101]Peter Gill, June Guiness and Simon Iveson, ‘The interpretation of DNA evidence (including low-template DNA)’ (Home Office, 2012), 8 (emphasis added).
In 2010, the English Court of Appeal in R v Reed[102] dismissed appeals against conviction where the prosecution had relied on LTDNA profiles at trial. The court was satisfied that ‘the underlying science’ for LTDNA analysis was sufficiently reliable to produce profiles.[103] That is, there was ‘a sufficiently reliable scientific basis’ for the evidence to be admitted.[104] Moreover, the court concluded, LTDNA could be used
to obtain profiles capable of reliable interpretation if the quantity of DNA that can be analysed is above the stochastic threshold.[105]
[102][2010] 1 Cr App R 23.
[103]Ibid [114].
[104]Ibid [111].
[105]Ibid [74].
A decade earlier, in R v Karger,[106] Mullighan J in the South Australian Supreme Court was concerned with the reliability of Profiler Plus, a methodology for interpreting DNA profiles to which reference has already been made.[107] (The issue in that case was governed by the common law.) His Honour likewise approached the question of reliability at different levels. He first considered whether the methodology had been ‘accepted by the relevant scientific community as reliable and accurate for DNA analysis’.[108] His Honour then turned to the question whether the system had been ‘validated by men and women of science, found to be accurate and reliable and … accepted by them for use in the forensic context.’[109]
[106](2001) 83 SASR 1.
[107]See [17] and [67] above.
[108]R v Karger (2001) 83 SASR 1, [181]: see also [188], [189] and [228].
[109]Ibid [465].
His Honour quoted the following US guidelines in relation to validation:[110]
Validation is the process used by the scientific community to acquire the necessary information to assess the ability of a procedure to reliably obtain a desired result, determine the conditions under which such results can be obtained and determine the limitations of the procedure. The validation process identifies the critical aspects of a procedure which must be carefully controlled and monitored.
Validation studies must have been conducted by the DNA laboratory or scientific community prior to the adoption of a procedure by the DNA laboratory.[111]
Mullighan J concluded that, on the evidence before him, the methodology relied on had been adequately validated. The validation studies had been
conducted without error and produced results which enabled the Forensic Science Centre to establish appropriate standards and protocols, including threshold levels which permit accurate and reliable analysis and interpretation of results, including for example with low levels of DNA.[112]
[110]The guidelines were published by the body now known as the Scientific Working Group on DNA Analysis Methods (SWGDAM), which has published current, equivalent guidelines: Scientific Working Group on DNA Analysis Methods, Validation Guidelines for DNA Analysis Methods (2012).
[111]The wording of the corresponding National Association of Testing Agencies (NATA) standard was identical: R v Karger (2001) 83 SASR 1, 99 [140], [461].
[112]Ibid [543] (emphasis added).
An article published in 2011 described the validation of another interpretive methodology, known as ‘TrueAllele’.[113] The authors[114] said:
To demonstrate the reliability of DNA testing, forensic scientists conduct extensive validations of their STR data generation methods. Given the wide disparities found in DNA mixture interpretation results and the ongoing controversy surrounding mixture interpretation methods, clearly these methods should similarly be subject to scientific scrutiny.[115]
According to the article, the study had validated the TrueAllele genetic calculator for DNA mixture interpretation ‘using statistical measures of efficacy and reproducibility’.[116] (As counsel for the respondent pointed out, in 2012 the Superior Court of Pennsylvania accepted that TrueAllele had been ‘tested and validated in peer-reviewed studies’.)[117]
[113]Mark Perlin et al, ‘Validating TrueAllele® DNA Mixture Interpretation’ (2011) 56 Journal of Forensic Sciences 1430.
[114]Most of the authors were identified in the article as employees of the company which had developed the interpretive software.
[115]Mark Perlin et al, ‘Validating TrueAllele® DNA Mixture Interpretation’ (2011) 56 Journal of Forensic Sciences 1430, 1443.
[116]Ibid 1444.
[117]Commonwealth v Foley 38 A 3d 882, [9] (2012).
In a 2013 article,[118] David Balding of University College London commented that courts were ‘concerned about the reliability of the underlying science’ of LTDNA analysis. He argued that,
rather than the reliability of the science, courts and commentators should focus on the validity of the statistical methods of evaluation of the evidence.
It was put to Ms Federle that using weightings and abandoning thresholds had not been robustly debated in the broader scientific forensic DNA community. She responded that there had been a lot of literature about the different methods and, in the past, there had been recommendations to move to this sort of method, but there had not been the statistical programs to enable a method to be implemented. The continuous method was advocated for dealing with partial profiles and low level profiles, and the debate had proceeded on the basis that this was the way forward.
In my view, this evidence places STRmix and its fully-continuous probabilistic methodology squarely within the field of statistical DNA evaluation, albeit possibly at its leading edge. It has a credible scientific and mathematical basis. However, that is not to say that the results that it produces are inherently reliable. It may not produce results that are as reliable for the purposes of forensic case-work as binary or semi-continuous methods. Its method for determining the possibility or probability of drop-out based on peak height variability may be open to criticism. Its use on compromised profiles may be questionable. But those are matters for competing expert opinion, providing, of course, that STRmix is amenable to scrutiny and independent testing.[134]
[133]See [41] above.
[134]Reasons [48]–[58].
Her Honour then turned to deal with the amenability of STRmix to scrutiny and independent testing:
I observe in this context that while STRmix produces quite extensive reports (referred to as ‘output data’) that permit scrutiny of many of its operations, Dr Taylor gave evidence that this output data is incomplete for the reason that there are ‘hundreds of thousands of different data points and calculations’ and it is impossible to include them all in the output data due to their sheer number.
The STRmix Research Paper acknowledges that without an understanding of the underlying mathematics, there is a risk that systems become ‘black boxes’ the workings of which are not understood by the users and that presentation of any statistical analysis in court becomes problematic.
To this end, the STRmix Research Paper describes the mathematics underpinning STRmix along with the practical implementation of the mathematics and the means of calculating the likelihood ratio. It is an attempt by the developers of STRmix to expose the internal workings of STRmix by setting out the mathematical and statistical models on which it is based.
The STRmix Research Paper also deals with the issues of reproducibility and describes the validation experiments carried out by the authors. The results of the validation experiments are set out in the appendices. According to the authors, inspection of the results of these experiments suggests that the likelihood ratio assigned by the method is fair and reasonable.
In his evidence, Dr Taylor agreed that it is not feasible to validate a process for producing a likelihood ratio in the way that one might validate a procedure for measuring a physical quantity, because a likelihood ratio has no true value: it expresses uncertainty about an unknown event and depends on modelling assumptions that cannot be expressly verified. However, he said that progress can be made in evaluating the validity and performance of software, and that the courts need these kinds of evaluations to have confidence in the results generated by software-based forensic analysis.
Dr Taylor gave evidence that validation can be done at several levels. The STRmix Research Paper addresses the question of conceptual validation. Development validation of the software involves ascertaining whether the software does what the Research Paper says it does. Then there is laboratory-based validation, which is verification that the software is fit for purpose in the hands of scientists. That involves examination of interpretations of mixtures of known contributors and comparison against other methods and/or human judgment. The STRmix Research Paper details validation of both kinds, including validation studies using known contributors and comparison against other methods and human judgment.
In addition, the STRmix User’s Manual details the verification ‘by hand’ of a number of STRmix functions, including expected allele and stutter heights and expected peak heights of drop or ‘Q’ alleles.
VPFSS has carried out validation studies using known contributors and P+ and PP21 mixtures. The validation studies were tendered in court and Ms Federle and Ms Scott gave evidence about them at the preliminary hearing.
The PP21 study is dated April 2013 and tests with two person and three person mixtures. Six known individuals were used in different pairings and at different ratios. A total of 16 two person mixtures were amplified, producing 161 deconvolutions and 307 likelihood ratios. Ten three person mixtures were amplified, producing 124 deconvolutions and 371 likelihood ratios. In addition, three mocked partial profiles were analysed a number of times using STRmix.
The P+ study is dated May 2013. Again, tests were conducted with two person and three person mixtures and six known individuals were used in different pairings and at different ratios. A total of 12 two person mixtures were amplified, 60 deconvolutions performed and 120 likelihood ratios calculated. For the three person mixture study, 11 mixtures were amplified, 60 deconvolutions performed and 180 likelihood ratios calculated.
Both studies contained a variety of other testing and calibration, including the pilot studies for peak height variance using Model Maker required for STRmix.
All of these studies are open to scrutiny.
It is the defence position that the validation studies are inadequate because they did not test a sufficient number of samples and did not use the ratios that are in issue in this case.
In her evidence about these studies, Ms Federle described them as ‘large’ in the sense that each of the different mixture ratios was repeated over and over again to create a lot of data. The ratios used were intended to cover a broad range of different scenarios. According to Ms Federle, there was nothing in the studies to suggest that other mixture ratios would behave any differently from the study ratios. She said that none threw up any issues, so it was possible to extrapolate from them in respect of all sorts of mixtures.
For his part, Dr Taylor commented on the VPFSS testing with PP21 mixtures, observing that there was a reasonably substantial number of calculations and that contributors were combined in different proportions and in different amounts. When it was put to Dr Taylor that no validation study had been done for the particular ratios in the present case, Dr Taylor said that it was possible to extrapolate based on the kinds of ratios that were tested. The aim is to vary the total amount of DNA that goes into the mixtures and the proportions for each contributor to show mixtures in a range of configurations — a major-minor, two equal contributors, two equal contributors of very low amounts of DNA and so forth — to permit extrapolation from these results and to see that STRmix is performing as expected or as intended, based on these mixtures. He gave evidence that he himself had recently published a study in which he looked at two, three and four person mixtures at high and low levels and in different concentrations, and he concluded that STRmix was behaving as would be expected and could therefore be used on evidence profiles.
In cross-examination, Ms Federle was taken to the STRmix validation study that tested with P+ mixtures. She agreed that in the two person mixtures, when both of the contributors contributed low levels of DNA, the likelihood ratio was determined to be less than one at four of the markers, giving more support for the actual contributor not contributing to the DNA profile at four of the nine markers. It was put that the study did not validate the methodology. Ms Federle responded that this did not prove that the contributor was a non-contributor, but highlighted the issues with low-template DNA. According to Ms Federle, that sort of result was to be expected, given the input amount of the contributors to the profile: when there are really low contributions, STRmix will not calculate a likelihood ratio indicating a contribution that is contrary to the evidence. It will therefore give a likelihood ratio that favours the alternative view. That, says Ms Federle, is a totally reasonable explanation for the evidence.
Ms Federle denied that this showed that the validation group was too small and said it was just one of the effects of low-template DNA. It showed that STRmix was doing the correct thing and not overstating the evidence.
By contrast, Ms Taupin viewed the production of ‘false negatives’ of this kind by STRmix as an indication that it was unreliable.
In my view, this will be an issue for the experts at trial. I do not see it as showing that STRmix lacks an objective foundation or consider that it detracts from the cogency or reliability of the validation studies.[135]
[135]Ibid [60]–[78] (citations omitted).
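By way of illustration only, the kind of known-contributor testing described in the passage above might be tallied along the following lines. This is a minimal sketch and not the procedure followed by VPFSS or by the developers of STRmix: the function name, the data and the choice of one as the dividing line are assumptions made for the example. It counts the cases in which a likelihood ratio points away from the known ground truth, being the kind of result that the witnesses characterised differently.

```python
def tally_validation(results):
    """Count outcomes in which the likelihood ratio points away from the known truth.

    results: iterable of (is_known_contributor, likelihood_ratio) pairs.
    """
    contributors_below_one = 0      # known contributor, but LR favours the alternative
    non_contributors_above_one = 0  # known non-contributor, but LR favours inclusion
    total = 0
    for is_contributor, lr in results:
        total += 1
        if is_contributor and lr < 1:
            contributors_below_one += 1
        elif not is_contributor and lr > 1:
            non_contributors_above_one += 1
    return {
        "total": total,
        "contributors_below_one": contributors_below_one,
        "non_contributors_above_one": non_contributors_above_one,
    }


# Invented figures for illustration; not drawn from any validation study.
hypothetical_results = [
    (True, 1.2e6),
    (True, 350.0),
    (True, 0.4),
    (False, 0.001),
    (False, 0.02),
    (False, 3.0),
]
summary = tally_validation(hypothetical_results)
```

A tally of this kind says nothing about why a particular result fell on one side of the line. That is a question for expert evidence.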
Having reviewed the literature to which her Honour referred, and the evidence of the witnesses on the voir dire, we concluded that her findings were well open. If we may respectfully say so, her Honour discharged her ‘gatekeeper’ function[136] with exemplary care and thoroughness.
[136]R v J-LJ [2000] 2 SCR 600, 630 [61].
A particular question debated before this Court concerned the extent to which the results produced by STRmix were reproducible. It is acknowledged that the results are not precisely reproducible, for the reasons given by Dr Taylor in his 2014 article:
Continuous DNA interpretation methods … also have a level of non-reproducibility as Markov Chain Monte Carlo systems are based on random number generation and so the statistic calculated differs each time they are run. The variation between replicate biological analyses does not, however, make the results unreliable as long as that variability is taken into account within the biological and mathematical models used to interpret them. Examples of statistical models that take high levels of non-reproducibility into account are the construction of consensus profiles for low template DNA analyses. The reverse is also true; a completely reproducible result may be unreliable if the means in which it was generated are unreliable in some way.[137]
[137]Duncan Taylor, ‘Using Continuous DNA Interpretation Methods to Revisit Likelihood Ratio Behaviour’ (2014) 11 Forensic Science International: Genetics 144, 152.
Counsel for the respondent pointed out that the 2013 article by Dr Taylor and others showed that the statistical variances were vanishingly small.[138] We are satisfied that these variances do not affect the reliability of the methodology or of the likelihood ratios derived from its application.
[138]Duncan Taylor, Jo-Anne Bright and John Buckleton, ‘The Interpretation of Single Source and Mixed DNA Profiles’ (2013) 6 Forensic Science International: Genetics 516, 521.
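The point about run-to-run variation can be illustrated with a simple sketch. It uses plain Monte Carlo sampling of a known probability, not the Markov Chain Monte Carlo procedure used by STRmix, and every name and number in it is an assumption chosen for the example. It shows only that an estimate built on random number generation differs from run to run, that the same seed reproduces the same figure, and that the spread between runs narrows as the number of samples grows.

```python
import random
import statistics


def monte_carlo_estimate(n_samples, seed):
    """Estimate P(X < 0.3), X uniform on [0, 1), by random sampling."""
    rng = random.Random(seed)  # a separate generator per run, fixed by its seed
    hits = sum(1 for _ in range(n_samples) if rng.random() < 0.3)
    return hits / n_samples


def spread_across_runs(n_samples, seeds):
    """Sample standard deviation of the estimate over independent runs."""
    estimates = [monte_carlo_estimate(n_samples, s) for s in seeds]
    return statistics.stdev(estimates)


seeds = list(range(20))
spread_small = spread_across_runs(100, seeds)      # few samples per run
spread_large = spread_across_runs(100_000, seeds)  # many samples per run
```

On these assumptions the spread for the longer runs is much smaller than for the shorter runs, although no two runs with different seeds need agree exactly.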
Weighing up probative value and the danger of unfair prejudice
As her Honour correctly stated, she was required by s 137 to determine the weight which a jury acting reasonably could assign to the opinion evidence. Her Honour’s analysis was as follows:
The prosecution witnesses acknowledge that there are limitations in the STRmix methodology, a number of which are detailed in the STRmix Research Paper. The STRmix Research Paper describes limitations arising in some cases from conscious choice and in others from the current state of the model development. They include the danger that a large artefact is allowed through the manual review of the electropherogram and what is described as a ‘sub-optimal’ stutter model. It is also recognised that it is difficult to test continuous models for the accuracy of the likelihood ratio produced because the correct answer is unknown and in many cases unknowable.
Importantly, so far as I can tell, none of the prosecution witnesses contend that the forensic scientist can simply ignore the quality of the evidentiary profile and feed any kind of profile into STRmix to produce a reliable likelihood ratio. Although STRmix purports to be specifically suited for the analysis of low-template and partial DNA profiles, there is a need to carefully consider the profile and whether it is suitable for statistical evaluation for use in legal proceedings.
I have found the STRmix methodology to be a development in a recognised field of knowledge concerned with the statistical evaluation of DNA profiles. It is not subjective or speculative or otherwise to be dismissed as lacking an objective basis. In my view, based on the opinions of Dr Taylor, Ms Federle and Ms Scott the limitations identified do not substantially erode the probative worth of the DNA evidence.
However, it is clear that there is scope for competing expert evidence about the reliability of the STRmix methodology. What a lack of international take-up or independent review and assessment of the STRmix methodology means for the weight that should be given to the DNA evidence will be a matter for the jury, based on differing expert evidence on this issue. Likewise, the suitability of STRmix (or any system based on peak height modelling) for the analysis of low-template DNA will be the subject of disagreement between experts upon which the jury will be called to adjudicate. As to the specific problems identified by Ms Taupin in the VPFSS analysis of the Items, notably whether the profile for Item 1-3 shows three or four contributors and whether the accused is excluded as a contributor to Item 4-1 because the PP21 profile does not show a Y allele at the amelogenin marker, these again are matters that can and should be resolved by a jury hearing the expert evidence and deciding which evidence is to be preferred.
Her Honour concluded as follows:
In my view, the DNA evidence viewed as a whole is highly probative. It may be used by a jury to put the accused both inside and outside the house on the night in question. This is so, notwithstanding only small amounts of DNA matching that of the accused were found on the relevant items inside the house, and that other people also contributed to the DNA found on these items. The limitations in the STRmix methodology acknowledged by the prosecution witnesses must have some effect on the quality of the DNA evidence. However, I am not persuaded that they erode its probative value to any significant degree. Whilst the amounts of DNA may be small in some cases, the fact that DNA matching the accused’s was found on a number of items both inside and outside the house in my view fortifies the overall probative value of the DNA evidence, which I assess to be high.
In our view, that conclusion was well open on the evidence before her Honour. As to the danger of unfair prejudice, her Honour addressed two issues, one relating to a particular DNA sample, the other relating to the complexity of the DNA evidence generally. Her Honour described the first issue in these terms:
In this case, the danger of unfair prejudice is said to arise from a particular issue identified by Ms Taupin in the STRmix analysis of Item 1-2, although it has wider consequences as it is a product of the way in which STRmix works generally. Ms Taupin identified, having closely examined the STRmix case-notes, that at two of the 10 markers the probability of the evidence given the prosecution hypothesis was very very low, yet the likelihood ratios for the markers favoured the prosecution hypothesis. Ms Taupin pointed out that this means that STRmix produces likelihood ratios strongly favouring the prosecution hypothesis in circumstances where there is only very weak evidence to support that hypothesis. That, in combination with the very high likelihood ratios generated by STRmix, is said to be unfairly prejudicial to the accused and not something that should be allowed in a criminal trial.
Her Honour was not persuaded that this constituted unfair prejudice:
The likelihood ratios for the two markers in question favoured the prosecution hypothesis simply because the probability of the evidence given the prosecution hypothesis was higher than the probability of the evidence given the alternative hypothesis. The probability of the evidence given the alternative hypothesis was also very very low, even though it allowed for any other randomly selected member of the Australian Caucasian population to have contributed the DNA found on the exhibit rather than the accused. Furthermore, the likelihood ratios for individual markers are not the likelihood ratios that will be considered by the jury. The likelihood ratios sought to be relied on by the prosecution are the product of the likelihood ratios at each of the markers for the relevant Item.
In my view, this feature of STRmix does not give rise to unfair prejudice when considered alone or in the context of the DNA evidence as a whole. It is simply a function of the way in which likelihood ratios are calculated by STRmix.
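By way of illustration only, the arithmetic described in the passage above may be sketched as follows. The function name, the number of markers and every probability are hypothetical, chosen to show the point of principle. None is drawn from the evidence or from the STRmix output in this case, and the sketch does not purport to reproduce how STRmix itself arrives at its figures.

```python
from fractions import Fraction
from math import prod


def marker_lr(p_given_hp, p_given_hd):
    """Likelihood ratio at one marker: Pr(E | Hp) divided by Pr(E | Hd)."""
    return p_given_hp / p_given_hd


# Hypothetical (Pr(E | Hp), Pr(E | Hd)) pairs for three markers.
# These are assumed figures for illustration, not evidence from the case.
hypothetical_markers = [
    (Fraction(1, 10**6), Fraction(1, 10**8)),  # both very low; ratio is 100
    (Fraction(1, 20), Fraction(1, 200)),       # ratio is 10
    (Fraction(1, 50), Fraction(1, 100)),       # ratio is 2
]

# One likelihood ratio per marker.
per_marker = [marker_lr(hp, hd) for hp, hd in hypothetical_markers]

# The figure relied on for an item is the product of the per-marker ratios.
combined_lr = prod(per_marker)
```

On these assumed figures, the first marker yields a ratio favouring the prosecution hypothesis even though both probabilities are very low, because the ratio depends only on which probability is the larger and by how much. The combined figure is the product of the three ratios.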
Her Honour then turned to considering whether the DNA evidence might mislead or confuse jurors:
The mere fact that expert evidence deals with difficult and highly technical subject-matter does not, in itself, constitute unfair prejudice to the accused. However, if the DNA evidence were simply too complex for a jury to comprehend beyond its conclusions (the very large likelihood ratios), such incomprehensible complexity might amount to a very real prejudice. It is important that the jury understand the testing of the conclusions in cross-examination so that they do not misjudge the weight to be given to the likelihood ratios or adopt an illegitimate form of reasoning regarding their significance based on the appearance of scientific credibility.
In my view, the danger of unfair prejudice of this kind will not arise if the DNA evidence is led logically and sequentially by the prosecution and the jury is given proper assistance. It may be necessary to resort to visual and other aids, and Dr Taylor (or his equivalent) will be required to explain in very clear terms the mathematical and biological models upon which the STRmix methodology is based, and answer questions arising from its application in this particular case, such as (potentially) how weightings for particular genotype sets were arrived at and how allele frequencies were identified and used in the likelihood ratios.
Furthermore, any perceived prejudice that might arise from the complexity of the evidence (and, if relevant, the limited nature of the output data) could be addressed by a strong direction that the jury must not act on the STRmix conclusions unless they are wholly satisfied that they are soundly based in the evidence that they have heard and understood.
We respectfully agree. Her Honour’s conclusion accords with views previously expressed by this Court about the capacity of a jury to decide between competing expert opinions. In R v Juric,[139] Winneke P, Charles and Chernov JJA in a joint judgment observed that DNA evidence will not become inadmissible because experts express differing opinions upon the factual or scientific bases in respect of which their opinions are expressed.[140] The Court noted that, at the trial, there had been
a factual or scientific basis provided by each of the experts for the competing opinions given, a factual basis which was accessible to the jury and upon which they could come to a rational conclusion for preferring one opinion over the other … Full explanations were given by the experts for their competing views and his Honour in our view was correct to conclude that those were matters ‘accessible’ to the jury and capable of being resolved by them.[141]
[139](2002) 4 VR 411 (‘Juric’).
[140]Ibid 426.
[141]Ibid 428.
Mr Juric was subsequently retried before Nettle J. In a pre-trial ruling, his Honour said that, although the techniques and mechanisms underpinning the DNA evidence were ‘well beyond the ordinary experience of most men and women’ and were complex and difficult, he did not consider that they ‘would be incomprehensible to any attentive lay person once explained as they have been to me in the course of the evidence’.[142]
[142]R v Juric [2003] VSC 382, [41].
In R v Berry,[143] this Court was again faced with a ground of appeal contending that it had been beyond the jury’s capacity to assess competing expert opinions on DNA samples. In rejecting the ground, the Court said:
The submission that her Honour erred in the exercise of her discretion in admitting the DNA evidence … because of the competing expert opinions, cannot be sustained. An examination of the detailed evidence adduced from the experts during the trial and the aids which were utilised to assist the jury in understanding their evidence reinforces the view held by the trial judge that this was not a case in which the jury were unlikely to have understood the probative value of the DNA evidence and attributed to it undeserved weight. It could not be said that this was a case in which the jury was unable to evaluate the DNA evidence or resolve the conflict between experts.[144]
[143](2007) 17 VR 153.
[144]Ibid 161 [44].
It is clear from what her Honour said in the ruling that she will be astute to ensure that the scientific evidence is presented ‘logically and sequentially’ and is explained to the jury ‘in very clear terms’.
---