I start with the proposition, all too well established, that what passes for forensic science is, in too many instances, neither science nor reliable evidence. The National Registry of Exonerations keeps a list of all people who, since 1989, have been judicially determined to be factually innocent of crimes of which they were wrongfully convicted, and the list is up to 2,400 people. Of these, almost one-quarter were convicted in cases that involved either false or misleading forensic science.

I think it is worth remembering that many of the first forms of so-called forensic science were originally developed, not as science, but as aids to police investigations. There was little pretense in the early days (and we are talking now of the late nineteenth century) that these tools were supposed to be “hard” science or anything of the kind. They were just useful tools that the police could use to identify helpful leads or corroborate hunches.

But beginning in the early twentieth century, prosecutors sought to introduce such “evidence” in court, and, for this purpose, they began to clothe it with all sorts of bells and whistles that made it look like science. So you had people who were often just ordinary police officers or lab technicians being portrayed as “forensic science specialists,” who, by use of their “disciplines,” could make determinations to a “reasonable degree of scientific certainty.” The latter phrase, in particular, became critical, and in many states forensic specialists could not testify unless they were prepared to say that their conclusions were true to a reasonable degree of scientific certainty.

This was, to be frank, a bit bogus. The methodology was only semi-scientific at best and the conclusions reached were highly subjective and far from certain. Nevertheless, much of this early testimony was admitted without objection because defense counsel simply did not know anything about the area and so just let it go unchallenged.

This began to change somewhat with the advent, in the late 1980s, of the one really good form of forensic science, DNA testing. DNA testing was developed, not by police labs, but by hard scientists employing rigorous testing. From the outset it was used, and is still used, to help identify the guilty. But it soon proved to be a weapon as well for identifying false convictions and exonerating the innocent.

The great heroes of this development were Barry Scheck and Peter Neufeld at Cardozo Law School, who founded the Innocence Project in the late 1980s. Using DNA testing, the Innocence Project was able to show decisively that several hundred people previously convicted of such serious crimes as murder and rape were actually innocent. And in 40% of these very serious cases, other forms of so-called forensic science had been introduced at trial to establish guilt—incorrectly.

In the broader perspective, what DNA testing also showed was how weak and unreliable all the other forensic sciences were by comparison.

Chronologically, the next event of significance was the Daubert decision by the Supreme Court in 1993. Previously, the universal standard for the admission of scientific evidence was the Frye test developed by the D.C. Circuit in the beginning of the twentieth century. Frye held that, to be admissible, scientific evidence had to be “deduced from a well-recognized scientific principle or discovery, . . . sufficiently established to have gained general acceptance in the particular field in which it belongs.”

But while Frye itself, utilizing this test, held that polygraph (lie-detector) evidence was inadmissible, over time the test was watered down by interpreting “general acceptance in the particular field in which it belongs” to mean, inter alia, general acceptance among forensic practitioners. So, for example, since most fingerprint examiners were prepared to testify that theirs was a reliable methodology, the courts increasingly admitted fingerprint evidence without further inquiry, since it had “gained general acceptance in the particular field in which it belongs.”

Daubert, however, overruled Frye and prescribed that a judge had to play a much greater gatekeeping role before any kind of purportedly scientific evidence could be admitted. The judge was now required to look, not just at whether the evidence was the product of a method that was generally accepted, but also at whether that method had been tested, peer reviewed, had a low error rate, and the like. While none of these were absolute requirements for admissibility, what Daubert contemplated was that the judge would make a far more rigorous inquiry than had been the case under Frye. And while Daubert is only binding on the federal courts, it has proven sufficiently attractive that there are now thirty-eight states that have adopted either all or most of Daubert as their own standard.

So far, however, Daubert has made a much greater impact in civil cases than in criminal cases. While I am only familiar with the federal statistics, they show that Daubert challenges are not only made much more commonly in civil cases than in criminal cases, but also that they are much more successful in civil cases than in criminal cases. And my impression is that that is true of the state systems as well.

So why is this? As mentioned, one reason is that criminal defense lawyers do not know much about forensic science and, concomitantly, they do not know much about how to challenge it. Also, there is an economic aspect of this. In the civil cases, you often have well-heeled parties who can afford the best of experts. In the criminal system, most of the defendants are indigent, and while they get a free lawyer as of right, only in some states can they get a free expert. Even then, they have to specially apply for payment for the expert, and the judges will often impose budgetary restraints that will severely limit what the expert can do.

Still another factor is the unconscious bias of most judges in favor of the admission of this evidence. Many state judges presiding in state criminal courts are former prosecutors who used to introduce forensic evidence routinely, so to them it just seems natural. Also, in many states the criminal court judges are elected and cannot afford to be perceived to be “soft on crime” if they want to be reelected. But I also think that a big influence is that many judges are sensitive to the fact that juries these days expect the government to introduce forensic evidence, just as they see on television. Knowing this, some judges feel they are discriminating against the prosecutors if they do not allow in such evidence, because the jury will draw an adverse inference. So, whether for these or other reasons, it seems clear that, notwithstanding Daubert and the work of the Innocence Project, there have been few successful challenges to the admissibility of much forensic science.

Nevertheless, because of the doubts raised about much forensic science as a result of the DNA exonerations, Congress, in 2005, directed the National Academy of Sciences to study the problem. This led to a 328-page report, issued in 2009, entitled Strengthening Forensic Science in the United States: A Path Forward. The NAS report was highly critical of many previously accepted forensic techniques such as microscopic hair matching, bite mark matching, fiber matching, handwriting comparisons, toolmark analysis, shoeprint and tire-track analysis, blood-stain analysis, and so forth. And the common themes were that these techniques had not been the subject of rigorous scientific testing and were often highly subjective (and therefore prone to error).

Even fingerprint analysis, which, until DNA came along, was the “gold standard” of forensic science, was criticized by the NAS report as not being nearly as scientific as it purported to be. As an example, the NAS report pointed to the Brandon Mayfield incident. In 2004, terrorists bombed commuter trains in Madrid, killing and injuring many people. In their investigation, the Spanish police found a bag of detonators with at least one recoverable fingerprint on it and they sent the fingerprint to the various police fingerprint databases throughout the world, including that maintained by the FBI. Shortly thereafter, the FBI announced that the fingerprint was a perfect match for the fingerprint of an Oregon attorney named Brandon Mayfield.

The Spanish police at first were skeptical. But the FBI was so sure it was right that it sent one of its examiners to Spain to try to convince the Spanish experts that it was Mayfield’s print. The FBI also undertook 24-hour-a-day surveillance of Mr. Mayfield, and when its agents got the impression he might flee, they went to a judge and, on the basis of their fingerprint comparison, got what is called a “material witness warrant” for Mayfield’s arrest and locked him up for a few weeks. While that was going on, they also got warrants to search his home, his office, and his cars.

Meanwhile, however, the Spanish police identified a man named Daoud as the likely suspect, and eventually he confessed to the bombing. A few days later, the FBI, after examining Daoud’s fingerprint, admitted its error and Mayfield—who had never actually been charged with any crime—was released from jail. But the Inspector General of the Department of Justice then undertook an investigation as to how the FBI could have gotten it so wrong. And that report concluded that among the factors that led the FBI fingerprint examiners to make their mistakes were “bias,” “circular reasoning,” and a reluctance to admit errors. So, you can see how subjective factors played such an important role even in such a prominent case as this one.

The NAS report, critical though it was of fingerprint matching, acknowledged that it was still a lot better than many other common forms of forensic evidence. And the NAS report was really devastating in its criticism of things like bite mark and toolmark matching, stating that they lacked “any meaningful scientific validation, determination of error rates, or reliability testing.”

The main recommendation of the NAS report was the creation of an independent National Institute of Forensic Science that would undertake rigorous testing of all forms of forensic evidence and set standards for their use in court and elsewhere. But the recommendation was opposed by all the special interests who you might think would oppose it, and it never really gained traction.

Nevertheless, in reaction to the NAS report, the Justice Department (to its credit), in collaboration with the Department of Commerce, created the National Commission on Forensic Science, which commenced its work in 2013. It consisted of 33 commissioners, who among them represented virtually every interest group concerned with forensic science. This included prosecutors, defense counsel, hard scientists, forensic science practitioners, lab directors, law professors, state court judges, and even one federal judge—me. The object was to arrive at recommendations to the Department of Justice that represented a consensus of all the various constituencies on the Commission, and, to this end, the commissioners early approved a requirement that no recommendation could be made without the approval of two-thirds of the commissioners. In any event, most of the Commission’s more than 40 recommendations were approved by three-quarters of the commissioners or more, and most of them were adopted by the Department of Justice.

It is well to remember, however, that the federal system is a very small part of the criminal justice system. Ninety percent or more of criminal cases are prosecuted in state courts. While the commissioners hoped that their recommendations, in addition to being adopted by the Department of Justice, would have a “trickle-down” effect on the states, it is not at all clear that this has happened.

By way of illustration, over 80% of the commissioners recommended to the Department of Justice that the phrase “to a reasonable degree of scientific certainty” should never be used by the Department’s forensic experts, as it was inherently misleading, conveyed to a judge and jury a degree of certainty that none of these forensic sciences really possessed, and was a phrase that no real scientist would ever use. The Department accepted this recommendation. But just this past summer, I had a hearing in a case involving a psychiatric issue, and the psychiatrist stated in his report that he had reached his conclusion “to a reasonable degree of scientific certainty.” Asked about this on the stand, he stated that he had no idea what this phrase meant, but that in prior cases in state court, he had been counseled by both prosecutors and defense counsel that he was required to include it, so he routinely did.

Toward the end of its term in 2017, the Commission was on the verge, with some difficulty, of addressing some of the tougher issues, such as the statistical issues involved in determining error rates, and accordingly a majority of the commissioners asked that the Commission’s term be extended. But this proposal was rejected by the Department of Justice, which said it would instead take steps internally to improve forensic evidence and its presentation. Given the Department’s overriding role as prosecutor in chief, one may be skeptical about its ability to be genuinely objective in approaching such a task.

Before concluding the chronology, let me mention one other relevant event. Since 2001, the President has had an advisory committee on scientific issues known as the President’s Council of Advisors on Science and Technology, or PCAST. Toward the very end of the Obama Administration, PCAST issued a report entitled Forensic Science in Criminal Courts: Ensuring Scientific Validity of Feature-Comparison Methods.

Like the NAS report, the PCAST report was highly critical of much forensic evidence, as well as of the way it continued to be used. With respect to microscopic hair analysis, for example, the report noted that “[s]tarting in 2012, the Department of Justice (DOJ) and FBI undertook an unprecedented review of testimony in more than 3,000 criminal cases involving microscopic hair analysis. Their initial results, released in 2015, showed that FBI examiners had provided scientifically invalid testimony in more than 95[%] of cases where that testimony was used to inculpate a defendant at trial.” Yet, despite this extraordinary admission by the FBI, microscopic hair analysis continues to be used in many state courts, where it has repeatedly been held to be admissible.

The PCAST report, taking account of the fact that Congress had not adopted the NAS report’s proposal for the creation of an independent institute to test and evaluate forensic science, recommended that the National Institute of Standards and Technology (NIST), an arm of the Department of Commerce, become more active in this area, since it would not be subject to the Department of Justice’s inherent conflicts. Since then, NIST has done some helpful work, but on a fairly narrow basis.

Nevertheless, I think there are some steps in the right direction that could be taken even now without creating much of a ruckus. The state forensic science labs could be made more independent of the police and prosecutor’s offices through separate funding and by developing an ethos of independence (this has already happened in Houston). They, as well as private forensic labs, could be subjected to more rigorous state and federal accreditation requirements. And, as suggested by the National Commission on Forensic Science, they could also be made subject to an ethics code. Finally, the testing by the forensic labs could be made blind—that is to say, free from any biasing information supplied by the police or other investigators.

In the courts, greater use could be made of court-appointed experts. And appellate courts could reduce the barriers to collateral review of criminal convictions in which doubtful forensic testimony played a role.

While I think each of these steps would be helpful, I do not want to overstate them. I still think that the most game-changing reform would be the creation of an independent national forensic institute, as recommended by the NAS report. But realistically, this is unlikely to happen in the near future.

So, where are we left? We are left with forensic techniques that in their origin were simply ways of improving investigations but that have now assumed a role in the criminal justice system that they cannot honestly support. Their results are portrayed to judges, juries, prosecutors, and defense counsel as having a degree of validity and reliability that is simply not the case. Maybe crime shows can live with that lie, but our criminal justice system should not.