Comparison: Significance of Hair Evidence

Introduction

Forensic hair examinations are most frequently conducted to assist investigation and prosecution of crimes of violence such as murders and sexual assaults. Although the probative value of hair comparison evidence is generally much lower than that of some other forms of forensic evidence such as fingerprints or DNA, it can still provide good corroborative evidence. Hair comparison can help establish associations between any combination of the following: accused, victim, crime scene or weapon.
Unlike blood, the mere presence of hair on an exhibit item is not usually by itself of any evidential value. A hair comparison with a sample of known origin must first be performed. After making a forensic hair comparison, the next (and most important) step is to assess its significance. This involves a two-stage process: using the results to develop a conclusion, and then interpreting that conclusion to form an expert opinion as expressed in a report or court testimony. This article begins with a general discussion on evaluating associative forensic science evidence. A discussion is then given of studies on probabilities and human hair comparison and the criticisms that have been made of them. Some other studies of the value of hair comparison evidence are then discussed followed by a discussion of report writing and court testimony.

Evaluating Associative Forensic Evidence

The fundamental question to consider when evaluating associative forensic science evidence is ‘What is the value of the evidence in establishing a particular association?’ This question is not as simple and straightforward as it might first appear. First, there is considerable controversy as to whether or not it is the role of a forensic scientist to answer this question in a court setting. However, even those who feel that it is not should acknowledge that consideration of such concepts helps to clarify thinking outside the court environment. Second, there are those who persuasively argue that evidential value can only be determined through Bayesian analysis, whereas others seem to find such analysis difficult to understand or accept. Finally, there are many components to the fundamental question such as:
1. What is the probability that the association was due to coincidence?
2. What is the probability that the association was due to examiner error?
3. What is the probability that there is an alternative explanation for the evidence such as secondary transfer, contamination or deliberate planting?
Discussion of the above noted controversies is beyond the scope of this article. At this point, it will simply be noted that from a Bayesian perspective, the important questions are: (a) what is the probability of the hair evidence if there was association; and (b) what is the probability of the hair evidence if there was no association, and that it is the ratio of these two probabilities that determines evidential value.
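The ratio described above can be written out as a short calculation. The following is a minimal sketch only: the function names and the input probabilities are hypothetical and are not taken from any study discussed in this article.

```python
def likelihood_ratio(p_evidence_if_association: float,
                     p_evidence_if_no_association: float) -> float:
    """Ratio of the probabilities named in questions (a) and (b)."""
    if p_evidence_if_no_association <= 0:
        raise ValueError("probability under no association must be positive")
    return p_evidence_if_association / p_evidence_if_no_association


def posterior_odds(prior_odds: float, ratio: float) -> float:
    """Bayes' rule in odds form: posterior odds = prior odds * likelihood ratio."""
    return prior_odds * ratio


# Hypothetical inputs, chosen only to show the arithmetic.
lr = likelihood_ratio(0.9, 0.001)
print(f"likelihood ratio: about {lr:.0f}")
print(f"posterior odds from prior odds of 1 to 100: about {posterior_odds(0.01, lr):.1f}")
```

A ratio well above 1 supports association, a ratio near 1 means the evidence has little value either way, and a ratio below 1 supports no association.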
The concept of type I and type II errors can also be helpful in understanding evidential value. Note that there are two possible states of nature with regard to association. Either there was some form of association (denoted as A) or there was not (denoted as N). Ignoring inconclusive results, there are two possible outcomes of a forensic scientist’s examination: either the evidence indicated association (E) or it did not (IE). If the state of nature is A and the forensic scientist gives an opinion indicating E, the forensic scientist is correct. Similarly, when the state of nature is N and the forensic scientist says IE, he or she is also correct. However, if the state of nature is A and the forensic scientist says IE, a type I error or incorrect exclusion has occurred. If the state of nature is N and the forensic scientist has given an opinion indicating E, there was a type II error or incorrect association.
Type I and type II errors can be better understood through an analogy to a fire alarm. Type I errors correspond to the fire alarm not ringing when there is a fire. Type II errors correspond to the fire alarm ringing when there is no fire. Depending on the decision we are making and the constraints involved, knowledge of the probability of type I or type II errors, or both, can greatly assist in making the decision. With the fire alarm, a type I error would be more serious than a type II error; with hair comparison a type II error would be the more serious since it could result in wrongly incriminating evidence being presented against a suspect. (It is for this reason that hair examiners should set a level of discrimination that minimizes type II errors without incurring an unreasonable number of type I errors.)
In report writing and court testimony, once a questioned hair has been found to be consistent with a known sample, it is only the probability of type II errors that is important in evaluating the evidence, just as once a fire alarm has rung it is the probability of type II errors and not the probability of type I errors that influences our decision as to whether or not to leave the building. In attempting to determine the value of forensic hair comparison in establishing associations, we need to know the probability of type II errors due to coincidental matches, examiner errors and alternative explanations for the evidence.
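The four combinations of state of nature and examiner opinion described in this section can be tabulated mechanically. The sketch below follows this article's convention (type I is an incorrect exclusion, type II an incorrect association); the function names and the trial data are hypothetical and serve only to illustrate the bookkeeping.

```python
from collections import Counter


def classify(state: str, opinion: str) -> str:
    """state: 'A' (association) or 'N' (none); opinion: 'E' or 'IE'."""
    if state not in ("A", "N") or opinion not in ("E", "IE"):
        raise ValueError("state must be 'A' or 'N'; opinion must be 'E' or 'IE'")
    if state == "A" and opinion == "E":
        return "correct association"
    if state == "N" and opinion == "IE":
        return "correct exclusion"
    if state == "A" and opinion == "IE":
        return "type I error"   # incorrect exclusion
    return "type II error"      # incorrect association


def error_rates(trials):
    """Type I rate among true associations, type II rate among true non-associations."""
    counts = Counter(classify(state, opinion) for state, opinion in trials)
    n_assoc = sum(1 for state, _ in trials if state == "A")
    n_none = sum(1 for state, _ in trials if state == "N")
    return {
        "type I": counts["type I error"] / n_assoc if n_assoc else None,
        "type II": counts["type II error"] / n_none if n_none else None,
    }


# Hypothetical blind-trial results: (true state, examiner opinion).
trials = [("A", "E"), ("A", "IE"), ("N", "IE"), ("N", "IE"), ("N", "E"), ("N", "IE")]
print(error_rates(trials))
```

Once a hair has been reported as consistent with a known sample, only the type II figure bears on how much weight the finding deserves.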


Studies on Probabilities and Human Hair Comparison

A lay person might observe that there are quite a few differences in the gross appearance of people’s hair. Some people have long hair, some short, some have curly hair, some straight; some have dark-colored hair, some light; some people bleach and dye their hair, some do not. A lay person would not, however, have any idea of the intrapersonal variation in hairs and the large number of hair characteristics that can be observed microscopically and the number of variables each characteristic can have. Accordingly, a lay person would not have any intuitive feel for the average value of forensic hair comparison evidence. In an attempt to rectify this situation, Gaudette and Keeping conducted a study in which, with the aid of a card-coding system, 366 630 pairwise comparisons were made between 861 hairs from 100 individuals. Of these, nine pairs of hairs were found to be macroscopically and microscopically indistinguishable. From this it was calculated that if a single scalp hair selected at random from individual A was found to be consistent with a single hair selected at random from individual B, the chance that the match was due to coincidence was about 9/366 630 or 1/40 500. If a single hair selected at random from A was found to be consistent with a representative known sample from B (consisting in the study of an average of about nine mutually dissimilar hairs), on average the chance of a coincidental match was 9 × 1/40 500 or about 1 in 4500.
In a similar study with pubic hairs, 101 368 comparisons were made of 454 hairs from 60 individuals. It was found that 16 pairs of hairs were macroscopically and microscopically indistinguishable. Therefore, if a single pubic hair selected at random from person A was found to be consistent with a single pubic hair selected at random from individual B, an estimate of the average probability of a coincidental match would be about 16/101 368 or 1/6336. If the single hair selected at random from A was found to be consistent with a known sample of pubic hairs (which in the study consisted of about eight mutually dissimilar hairs) from B, an estimate of the average probability of that one hair having originated from someone else would be 8 × 1/6336 or about 1 in 800. The greater likelihood of a coincidental match for pubic hair than for scalp hair may reflect the smaller variation in characteristics of pubic hairs throughout the population.
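The arithmetic behind the scalp and pubic hair estimates can be reproduced as follows. This is a sketch of the calculation as described in the text, using the reported counts (9 matching pairs in 366 630 comparisons; 16 matching pairs in 101 368 comparisons); the function name `one_in` is hypothetical. The exact quotients differ slightly from the rounded figures quoted in the text.

```python
def one_in(matching_pairs: int, comparisons: int, known_sample_size: int = 1) -> float:
    """Return N such that the estimated chance of a coincidental match is 1 in N.

    The single-hair estimate is matching_pairs / comparisons. Following the
    text, it is scaled by the number of mutually dissimilar hairs in the
    known sample when a questioned hair is compared with a whole sample.
    """
    return comparisons / (matching_pairs * known_sample_size)


# Scalp hair: single hair vs single hair, then vs a known sample of about 9 hairs.
print(f"scalp, hair vs hair:   1 in {one_in(9, 366_630):.1f}")
print(f"scalp, hair vs sample: 1 in {one_in(9, 366_630, 9):.1f}")

# Pubic hair: single hair vs single hair, then vs a known sample of about 8 hairs.
print(f"pubic, hair vs hair:   1 in {one_in(16, 101_368):.1f}")
print(f"pubic, hair vs sample: 1 in {one_in(16, 101_368, 8):.1f}")
```

These are averages over all hair types in the studies, so they describe no individual case exactly.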
An interesting finding of the pubic hair study was that hairs from one individual were involved in three matching pairs of hairs while hairs from seven other individuals were involved in two matching pairs. This shows that certain hair types and certain individuals are more likely to be involved in coincidental hair matches than others.
The Gaudette and Keeping results refer to the situation where a single questioned hair is found to be consistent with a known sample. The finding of two or more questioned hairs to be consistent with the known sample will greatly reduce the probability of a coincidental match. A probability estimate cannot be obtained by simply multiplying 1 in 4500 by 1 in 4500, however, since independence cannot be assumed.
It should be emphasized that the Gaudette and Keeping probability results are average values made up of the sum total of all hair types – from unusual hairs (where probability of a coincidental match would be virtually 0) to hairs of average commonness (where the probability of a coincidental match would approximate 1/4500), to common featureless hairs (whose probability of a coincidental match would be considerably greater).
The Gaudette and Keeping results can provide a good estimate of the average value of hair comparison evidence in establishing associations when the following conditions are met.
1. The probability of examiner error is very low. (This condition should be met when a well-trained qualified examiner carefully conducts the examination.)
2. The probability of secondary transfer, contamination or deliberate planting of evidence is very low.
3. Caucasian hairs are involved.
Those using such probability estimates should take care, however. It is extremely important to word probability statements carefully.
The Gaudette and Keeping studies have been criticized for alleged defects in experimental design and improper statistical treatment of the data. These criticisms have in turn been rebutted. Although Gaudette and Keeping’s work has been criticized, no studies have been offered to refute the results, and it has not been claimed that hair comparison evidence is not good evidence.

Other Studies on the Value of Hair Comparison Evidence

A wide range of opinions as to the value of hair evidence has appeared in the literature. Some authors take a disparaging view of hair evidence. The following quotation is typical: ‘There is nothing about hair comparable to the specificity of fingerprints, and at best the probability of establishing identification from hair is perhaps no greater than the probability of determining identification using the ABO blood group system in blood smears.’ (Camps 1968). On the other hand, the following quotation is typical of those authors who consider hair comparison evidence to have a high value: ‘From research studies, it has been shown that hairs from two individuals are distinguishable and that no accidental or coincidental matches occurred, and would, therefore in actual casework be a relatively rare event.’ (Strauss 1983).
The generally prevailing view of the value of hair comparison evidence lies between these extremes. These two quotations are representative:
Through hair comparison it is presently only rarely possible to determine that a questioned hair did or did not originate from a particular person. In the vast majority of cases it can only be stated that a questioned hair is or is not consistent with having originated from a particular person. Accordingly, hair comparison evidence is generally only of value when used in conjunction with other evidence. (Gaudette 1985)
1) So far, a hair or hairs have not been shown to have any features exclusively confined to an individual; 2) Any indication of identity based on an examination of hair can therefore only be established in terms of probability; 3) The probability is increased, under certain circumstances, if all the characteristic elements are considered and is increased to an even greater extent when unusual features such as uncommon colours, disease, etc. are present. (Martin 1957)
Although a large number of individuals have expressed opinions as to the value of hair comparison evidence, actual research studies on the topic have been more limited. In addition to the work of Gaudette and Keeping, the following studies have been reported.
In 1940, Kirk reported that a group of his students were, without exception, able to match one questioned hair to the correct known sample in a group of 20, all of similar colour and from individuals of similar age.
In 1978, Gaudette discussed two additional experiments on the value of hair comparison evidence. In the first experiment, 100 randomly selected questioned hairs were compared in a blind trial to one known sample. This experiment was repeated three times with three trainees, each near the end of a one-year training period. Two of these trainees correctly chose the one and only hair that was consistent with the known sample. The third trainee first concluded that four of the questioned hairs were consistent with the known sample. After examining the hairs more closely and consulting with other examiners he was easily able to identify one of his choices as being incorrect, leaving three hairs he thought to be consistent with the known sample: the correct one and two others. When Gaudette examined the hairs, he stated that one of the two others could be eliminated but the remaining one was indistinguishable from hairs in the known sample. Another experienced examiner then studied the hairs and also concluded that one of the two others could be eliminated. This time, however, it was the one opposite to that picked by Gaudette! All examiners did agree that the correct hair was consistent with the known sample. The hairs that caused the type II errors in this experiment were common featureless hairs.
In the second experiment, Gaudette compared 100 known hair samples to one questioned hair. He repeated the experiment three times using different sets of hairs. Twice the one and only correct known sample was picked as being consistent with the questioned hair (i.e. no type I or type II errors were made). In the third trial, a common featureless hair was chosen as the questioned hair. This hair was found to be consistent with two of the known samples, the correct one and one other (i.e. one type II error was made).
Strauss in 1983 conducted a series of seven experiments in which 10 questioned hairs were compared to 10 known samples. In each of the seven experiments, the known and questioned hairs were selected by a neutral party from a hair pool so that different numbers of questioned hairs actually matched the known samples. Each time Strauss correctly matched all questioned hairs to their correct known samples (i.e. no type I or type II errors were made).
Bisbing and Wolner reported on a study in 1984 whereby each of seven questioned hairs was compared to several known samples. The results are shown in Table 1. Hairs in this study were from twins, the majority of whom were below the age of six. The majority of the subjects were blond. Most of the hairs were common featureless types and cut samples were used, thereby reducing the number of comparative features.
Wickenheiser and Hepworth repeated the Gaudette and Keeping study in 1990, with experimental modifications designed to overcome some of the criticisms of the original study. Wickenheiser and Hepworth collected representative hair samples of at least 100 hairs from each of 97 Caucasian individuals, including some closely related people from several generations. They then selected 5-13 hairs from each sample as representative of the range of characteristics present. The principal variation from Gaudette and Keeping’s procedure was that they had an independent person randomly number the mutually dissimilar hairs and add 53 additional hairs randomly chosen from the original known samples of the 97 individuals. By including several duplicate hairs in the study, Wickenheiser and Hepworth ensured that if they encountered hairs they could not distinguish, they would not be biased by the knowledge that the hairs had to have originated from different sources.
With the assistance of a personal computer database to eliminate unnecessary microscopic comparisons of obviously dissimilar hairs (one of the authors still made 749 one-to-one microscopic comparisons, and the other author required 2006), they were able to conduct 431 985 pairwise hair comparisons. One author found seven pairs of hairs to be indistinguishable, and the other author found six. In all cases these matches were between duplicate hairs; neither examiner found any hairs from different individuals which coincidentally matched. Of the 53 duplicate hairs, 38 were found to be unique in that they had no matching hair in the known sample selected.
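The pairwise comparison counts quoted in these studies can be checked against the number of unordered pairs among n hairs, n(n - 1)/2. The sketch below is illustrative only and its function names are hypothetical. The text does not state how the totals were derived, so the readings in the comments (that 431 985 corresponds to every pair in a pool of 930 hairs, and that the Gaudette and Keeping figure omits some pairs, possibly those from the same individual) are assumptions, not reported facts.

```python
from math import comb, isqrt


def pairwise_comparisons(n_hairs: int) -> int:
    """Number of unordered pairs among n_hairs items."""
    return comb(n_hairs, 2)


def hairs_for(comparisons: int):
    """Return n with comb(n, 2) == comparisons, or None if no such n exists."""
    n = (1 + isqrt(1 + 8 * comparisons)) // 2
    return n if comb(n, 2) == comparisons else None


# Wickenheiser and Hepworth: 431 985 equals the pair count for 930 items
# (assumed reading: all pairs in the pooled, randomly numbered set).
print(hairs_for(431_985))

# Gaudette and Keeping: 861 hairs give more unordered pairs than the
# 366 630 comparisons reported (assumed reading: some pairs were not counted).
print(pairwise_comparisons(861) - 366_630)
```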

Table 1 Results of simulated forensic comparisons

Questioned   Hair    Number of   Number of matches         Comments
specimen     color   known       Examiner 1   Examiner 2
number               specimens
1            Brown   10          1            0            Hairs differ
2            Blond   10          2            2            Known contained duplicates
3            Blond   5           0            0
4            Brown   5           0            0            Twin eliminated
5            Brown   7           0            0
6            Blond   7           1            0            Indistinguishable in liquid media
7            Blond   8           1            1

This study led to several interesting conclusions.
1. If a one to one microscopic match is found between two hairs, the chances of it being a coincidental match are remote;
2. As reflected in the differences between the two examiners with respect to the number of direct microscopic comparisons required, the classification of hairs varies greatly between examiners;
3. The classification of hair is inconsistent due to variations over time. This then made the sorting procedure susceptible to error;
4. Five to thirteen macroscopically selected hairs are frequently inadequate to represent a known sample. This is the reason they did not find more matches between duplicate hairs. This led the authors to conclude that experimental work aimed at determining the optimum composition of a representative known hair sample is warranted.
The only reported study of the significance of non-Caucasian hairs was by Lamb and Tucker in 1994. In connection with the investigation of a series of sexual assaults, they compared known samples from 118 Afro-Caribbean suspects to questioned hairs from three crime scenes. Because the samples were collected over a two-year period, a full range of characteristics (such as length) could not be used and the level of discrimination had to be downwardly adjusted. Nevertheless, they were able to eliminate 62% of the suspects through low power microscopic examination with incident light. A further 25% of suspects were eliminated by transmitted light microscopy at higher powers, leaving only 9% of suspects which could not be eliminated.
Apart from Gaudette’s pubic hair study, no study on the value of non-scalp human hair comparisons has been published.
A study of the significance of dog hair comparison was conducted in conjunction with a celebrated American murder case (State of Georgia v. Wayne Williams). Gaudette compared the hairs from the suspect’s German shepherd dog to hairs from 12 other German shepherd dogs. The hairs from the suspect’s dog were divided into ten types depending on color and whether they were guard or intermediate hairs. Eight of the twelve comparison dogs had no hairs matching any of these ten types. Three of the comparison dogs each had one type of hair that was macroscopically and microscopically indistinguishable from one type of hair on the suspect’s dog. The remaining dog had two hair types indistinguishable from the suspect’s dog. It should be noted that the 12 comparison dogs were not selected at random from the population of all dogs but were deliberately chosen so that their coats closely matched the suspect’s dog. If they had been randomly chosen, an even smaller number of coincidental matches would have been found.
In a blind study involving comparison of 15 questioned hairs to known hair samples obtained from 25 pure bred German shepherd dogs, no type II errors were made and 6 of the 15 questioned hairs were correctly assigned to their known sample of origin. In a later extension of this study, a comparison of 25 questioned hair samples of about 10 hairs each to known samples from 100 mixed breed and purebred dogs of various types resulted in all 25 being correctly assigned with no incorrect associations.
From these various studies the following can be concluded about the value of forensic hair comparison evidence.
1. With a few isolated exceptions, hairs are not unique to an individual. Accordingly, it is possible for type II errors due to coincidental matches to occur in forensic hair comparison.
2. Type II errors are a relatively rare event in forensic hair comparisons conducted carefully by qualified, well-trained examiners. Accordingly, hair comparison evidence is generally good corroborative evidence.
3. There are several factors which can increase or decrease the probability of type II errors in a given case. Accordingly, each case must be considered on its own merit.

Use of Frequency of Occurrence Data

Some forensic scientists have proposed setting up a computerized database of hair comparison characteristics which they would then use to state frequency of occurrence data in court. There are, however, many problems with such an approach. First, presentation of frequency data on its own can lead to a distorted picture of the value of evidence along with a false sense of exactness. Second, there is the difficulty of characterizing the hairs for a database. It requires examiners to adopt a check list approach rather than the more natural pattern recognition approach. Two hairs described as alike can be markedly different microscopically. Two examiners are likely to describe hairs in slightly different ways. The same examiner will even vary his or her description from day to day. And finally, setting up such a database would be extremely time consuming. Accordingly, results-oriented research (such as previously described studies) is much preferable to the database approach. This is not to suggest, however, that information from databases would not be valuable. On the contrary, it could be quite useful in helping examiners decide which characteristics, and combinations thereof, are unusual and which are common.

Report Writing and Court Testimony

On the basis of the results of an examination, the hair examiner must draw a conclusion which he or she then interprets in giving an expert opinion as to evidential value. Conclusions and expert opinions are given in report writing and court testimony. Exact wording of conclusions will depend on an examiner’s preferences and a laboratory’s policy. A symmetrical spectrum of conclusions such as the following is suggested. (A positive conclusion is defined here as one drawn from a finding of similarity between a known sample and a questioned hair. A negative conclusion is one arising from a finding of dissimilarity.)
Strong positive: The questioned hairs originated from the same person as the known sample.
Normal positive: The questioned hairs are consistent with having originated from the same person as the known sample.
Inconclusive: No conclusion can be given as to whether the questioned and known hairs have a common origin.
Normal negative: The questioned hairs are not consistent with having originated from the same person as the known sample.
Strong negative: The questioned hairs could not have originated from the same person as the known sample.
The great majority of hair comparisons will result in normal positive or normal negative conclusions, with the other three being rarely encountered.
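For laboratories that record conclusions in a case-management system, the five-point spectrum above can be held as an ordered scale. The following is only a sketch: the class name, the numeric values and the `mirror` helper are hypothetical, chosen to make the symmetry of the scale explicit, and the values carry no statistical meaning.

```python
from enum import IntEnum


class Conclusion(IntEnum):
    """Five-point scale; the integer values are illustrative labels only."""
    STRONG_NEGATIVE = -2
    NORMAL_NEGATIVE = -1
    INCONCLUSIVE = 0
    NORMAL_POSITIVE = 1
    STRONG_POSITIVE = 2


WORDING = {
    Conclusion.STRONG_POSITIVE: "originated from the same person as the known sample",
    Conclusion.NORMAL_POSITIVE: "are consistent with having originated from the same person as the known sample",
    Conclusion.INCONCLUSIVE: "no conclusion can be given as to common origin",
    Conclusion.NORMAL_NEGATIVE: "are not consistent with having originated from the same person as the known sample",
    Conclusion.STRONG_NEGATIVE: "could not have originated from the same person as the known sample",
}


def mirror(conclusion: Conclusion) -> Conclusion:
    """Return the conclusion of equal strength on the opposite side of the scale."""
    return Conclusion(-conclusion)


print(mirror(Conclusion.NORMAL_POSITIVE).name)
```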
The normal positive and normal negative conclusions cover a wide range of evidential value. Accordingly, it is important that they be further interpreted in reports and court testimony. The examiner should first mention that hair comparison is not usually a positive means of personal identification. An estimate of the average value of forensic hair comparison evidence should then be given. This can be either based on personal experience or some of the previously described published studies. Factors weakening or strengthening the evidence in the particular case should then be mentioned. Some factors which can weaken hair evidence in a particular case are given in Table 2. Some factors which tend to strengthen normal positive hair comparison conclusions are given in Table 3. Similarly, for negative conclusions, factors weakening them are given in Table 4 and factors strengthening them are given in Table 5.

Table 2 Some factors which tend to weaken positive hair comparison conclusions

1. The presence of incomplete hairs.
2. Questioned hairs which are common featureless hairs.
3. Hairs of non-Caucasian racial origin.
4. A questioned hair found in conjunction with other unassociated hairs.
5. Known samples with large intrasample variation.

Table 3 Some factors which tend to strengthen positive hair comparison conclusions

1. Two or more mutually dissimilar hairs found to be similar to a known sample.
2. Hairs with unusual characteristics.
3. Hairs found in unexpected places.
4. Two way transfer, for example, a victim’s hair found on an accused’s clothing and an accused’s hair found on the victim’s clothing.
5. Additional examinations.

Table 4 Some factors which tend to weaken normal negative hair comparison conclusions

1. Deficiencies in the known sample.
(a) not enough hairs,
(b) not representative,
(c) contains incomplete hairs,
(d) large time difference between offence and procurement of known sample.
2. Incomplete questioned hairs.
3. Questioned hair has macroscopic characteristics close to those of the known sample.

Table 5 Some factors which tend to strengthen normal negative hair comparison conclusions

1. Known sample has more than the recommended number of hairs.
2. Known sample shows little intrasample variation.
3. Questioned hair has macroscopic and microscopic characteristics very dissimilar to those of the known sample.
4. Two or more questioned hairs found together in a clump are dissimilar to the known sample.
