Forensic genealogy is the emerging practice of using genetic information from direct-to-consumer companies to identify suspects or victims in criminal cases. As of July 2019, the practice had led to the identification of more than 40 suspects in murder and sexual assault cases.[1] The investigative power of forensic genealogy depends on open-source databases such as GEDMatch, to which users can upload their genealogy results from direct-to-consumer companies in an effort to identify relatives.
This is possible through analysis of identity-by-descent (IBD) segments of DNA that indicate shared ancestors.[2] Data available in GEDMatch, which is composed of genetic profiles from approximately 1.2 million individuals, has proven capable of identifying a third cousin or closer in over 90% of the population.[3] This information, used in tandem with demographic identifiers like age, gender, and place of residence, is sufficient for identifying any person who has a third cousin or closer within an open-source database.
Law enforcement agencies have taken advantage of access to open-source databases by uploading crime-scene genetic data and inferring relatives of potential suspects.[4][5][6] Genealogy experts then assemble family trees and analyze demographic identifiers. The company Parabon NanoLabs has led much of the effort to establish forensic genealogy as an investigative tool. In May 2019, the company claimed to be solving cold cases at a rate of one per week while simultaneously working on hundreds of cases.[7]
Forensic genealogy has been central to numerous high-profile cases, most notably the identification and ultimate arrest of Joseph DeAngelo, the Golden State Killer.[4] Despite its apparent success, the growing use of open-source databases by law enforcement agencies has drawn serious scrutiny. Several years before the arrest of DeAngelo, an individual was wrongly identified as a suspect in the murder of Angie Dodge, an 18-year-old woman who was killed in 1996 in Idaho Falls, Idaho. A police investigation led to a court order requiring Ancestry.com to disclose the identity of a partial match to crime scene DNA.[8] This partial match was Michael Usry, who was ultimately cleared as a suspect after police secured a warrant for his DNA and testing showed that he was not a full match to the perpetrator.
Investigators' largely unrestricted use of open-source databases has prompted a debate over the Fourth Amendment implications of genealogy data. The Fourth Amendment states that a warrant is required in situations that violate an individual's reasonable expectations of privacy.[9] Given the sensitivity of the information held in direct-to-consumer genealogy databases, particularly concerning medical traits, behavioral tendencies, ethnic background, and familial associations, courts have asserted that such databases are subject to protection under the Fourth Amendment.[10]
Currently, direct-to-consumer companies do not promise complete protection of user data. 23andMe, a leading consumer genealogy company, states in its privacy policy that “23andMe will preserve and disclose any and all information to law enforcement agencies or others if required to do so by law or in the good faith belief that such preservation or disclosure is reasonably necessary to…comply with legal or regulatory process”.[11]
In an effort to remain transparent to its consumers, 23andMe publishes a quarterly Transparency Report. This report states the number of government requests for user data and the number of times data has been produced without the explicit consent of the individual(s) of interest. 23andMe claims never to have produced user data without consent.[12] The other industry leader, Ancestry.com, takes an analogous stance on the privacy of user data and similarly provides an annual transparency report.[13]
The direct-to-consumer genealogy company FamilyTreeDNA faced a backlash after admitting that it had been working secretly with the FBI. This partnership was initiated in 2018 with the goal of solving cold cases involving murder and rape.[14] Following scrutiny, FamilyTreeDNA's president Bennett Greenspan apologized for a lack of transparency, stating “I am genuinely sorry for not having handled our communications with you as we should have”.[14]
The privacy implications of open-source databases like GEDMatch are distinct from those of direct-to-consumer companies. Because users voluntarily upload their genealogy profiles to GEDMatch, they forfeit their expectation of privacy in that data. The third-party doctrine, originally established by the Supreme Court, states that a person “has no legitimate expectation of privacy in information…voluntarily turn[ed] over to third parties”.[15] However, following intense media attention after the arrest of the Golden State Killer, GEDMatch changed its terms of service to require individuals to opt into use of their profiles by third parties.[16] In effect, privacy rights were shifted back into the hands of the users.
The government's own Combined DNA Index System (CODIS) database is composed of forensic evidence accessible to local, state, and federal law enforcement officials. This database consists of genetic profiles of approximately 18 million different people; however, these are limited to DNA samples from convicted felons and arrestees.[17] Data on the racial distribution of profiles suggests that 8.6% of the entire African American population is present in the database compared to only 2% of the white population.[10]
In contrast, about 75% of the genetic profiles in direct-to-consumer databases and GEDMatch come from white individuals of Northern European descent.[2] The vast overrepresentation of African American individuals within the CODIS database has rendered it relatively ineffective for solving serial murder and sexual assault cases, in which the majority of perpetrators are white. Based on data from 4,700 mass murderers, 57% of serial killers are white whereas 29% are African American; the latter figure is itself an overrepresentation, given that African Americans make up 14% of the population.[18]
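The proportions quoted for CODIS and for the offender data can be compared as simple ratios. The short Python sketch below uses only the percentages given in the text; the function and variable names are illustrative and do not come from the cited sources.

```python
# Percentages as quoted in the text, expressed as fractions.
CODIS_COVERAGE_AFRICAN_AMERICAN = 0.086  # share of the African American population in CODIS
CODIS_COVERAGE_WHITE = 0.02              # share of the white population in CODIS
OFFENDER_SHARE_AFRICAN_AMERICAN = 0.29   # share of serial killers who are African American
POPULATION_SHARE_AFRICAN_AMERICAN = 0.14 # share of the general population


def representation_ratio(share: float, baseline: float) -> float:
    """Return how many times larger `share` is than `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return share / baseline


# How much more likely an African American individual is to appear in CODIS
# than a white individual, based on the quoted coverage figures.
codis_ratio = representation_ratio(
    CODIS_COVERAGE_AFRICAN_AMERICAN, CODIS_COVERAGE_WHITE
)

# How the quoted offender share compares with the quoted population share.
offender_ratio = representation_ratio(
    OFFENDER_SHARE_AFRICAN_AMERICAN, POPULATION_SHARE_AFRICAN_AMERICAN
)

print(f"CODIS coverage ratio: {codis_ratio:.1f}x")
print(f"Offender share vs. population share: {offender_ratio:.1f}x")
```

On these figures, the coverage gap in CODIS (roughly 4.3 times) is about twice as large as the gap between offender share and population share (roughly 2.1 times). This is only a comparison of the quoted percentages, not a statistical analysis of the underlying data.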