When it was completed in April 2003, the Human Genome Project helped revolutionize biomedical research by providing scientists with a reference map that allowed them to analyze DNA sequences for genetic clues to the origins of a host of diseases.
Twenty years later, a team of researchers has created a new “pangenome” that fills in sequencing gaps left by the original genome project and greatly expands the diversity of genomes represented.
The achievement is described in a paper published May 10 in the journal Nature, one of six papers on the pangenome project published simultaneously in Nature journals.
“The new reference information is much richer and improves our ability to analyze human genomes for the purposes of drug discovery, disease diagnosis, and genome-guided precision medicine,” said Ira Hall, a professor of genetics at Yale School of Medicine and director of the Yale Center for Genomic Health, who is one of five co-corresponding authors of the paper.
The work was conducted by the Human Pangenome Reference Consortium, an effort funded by the National Human Genome Research Institute to sequence and assemble genomes from individuals from diverse populations. Relying on a single reference genome, the group says, creates reference biases and undermines discoveries of variants, associations between genes and disease, and the accuracy of genetic analyses.
The new pangenome draws on complete DNA data collected from 47 individuals representing every continent except Antarctica. It not only incorporates more diverse genetic sequences but also adds 119 million base pairs, the building blocks of DNA, to the library of 3.2 billion previously known base pairs that make up the human genome.
The higher quality and depth of data in the pangenome reference will help clinicians better pinpoint potentially dangerous variants within an individual patient’s DNA, the researchers say.
“The inclusion of diverse human populations in this project ensures that the medical advances enabled by the pangenome will benefit all ancestry groups in an equitable way,” Hall said.
Eventually, the Human Pangenome Reference Consortium hopes to add complete genetic data on 350 individuals to its reference database.