In just a few years, public DNA databases have emerged as powerful tools for solving genetic mysteries. They’ve been used to locate long-lost relatives and help adopted children find their biological parents. They’re perhaps best known for helping cops solve cold cases, from IDing anonymous bodies to sniffing out criminal suspects. In the most famous example, the Golden State Killer was identified from a nearly 40-year-old DNA sample that linked him to the genetic profiles of his distant cousins, which had been posted on the website GEDmatch.
While these kinds of breakthroughs have proved the promise of investigative genetic genealogy, they’ve also raised serious privacy concerns — not just for people who have shared their genomic profiles, but for millions of people who have never even taken a DNA test.
GEDmatch and MyHeritage, two of the most popular DNA sites, currently have 1.4 million and 1.3 million members, respectively. However, the number of people whose identities might be traced through these databases is much larger.
“In theory, you could detect genetic relatedness between two genomes if they share ancestors within the past five generations,” explains Mine Su Erturk, a PhD student in operations, information, and technology at Stanford Graduate School of Business. In other words, your DNA links you to hundreds of distant cousins who share a small percentage of their genetic material with you but are otherwise perfect strangers. And, Erturk points out, “This also goes the other way for the next five generations: It affects your children and grandchildren who might not even be alive yet.”
Study co-author Mine Su Erturk is a 2021 SGF Lieberman Fellow.