August 10, 2022 | The International Organization for Standardization (ISO) has published Phenopackets, a standard initially developed by the Global Alliance for Genomics and Health (GA4GH) and championed at ISO under the Canadian Mirror Committee to ISO/TC215/SC1 Genomics informatics, and supported by the Standards Council of Canada.
“We finally have the very first standard for phenotype data available worldwide,” said University of Colorado professor Melissa Haendel in a press release. Haendel is a GA4GH contributor who launched the Phenopacket idea eight years ago. “Having this ISO standard will encourage software developers, infrastructure developers, healthcare systems to consider Phenopackets as a method for sharing patient-level information — securely and in a deidentified way — that can be useful for everything from rare to infectious diseases, and addressing many kinds of public health questions,” Haendel added.
The standard, “ISO 4454 Genomics informatics — Phenopackets: A format for phenotypic data exchange,” was published on 6 July 2022.
Phenopackets debuted at ISO thanks to the leadership of GA4GH and through the support of Canada’s National Member Body, the Standards Council of Canada (SCC), and its Innovation Initiative.
A “Phenopacket” is a packet of data — typically a file — that humans and computers can read. It describes a person’s phenotype.
GA4GH approved the standard in October 2019. Back then, Haendel predicted that phenopackets would help us better characterize phenotypes: What’s not there? How are these traits linked to genomic data, family phenotype, etc.? When were phenotypes first observed, and how did they change over time and with treatment? She hoped phenopackets would make it easier to collate more de-identified phenotypic data from around the web and query against them.
The goal is to order and structure the information. “If somebody gives you a piece of paper with a bunch of scribbled stuff and says, ‘Do research with that,’ you’re going to go, ‘Well I don’t know what that is!’ You have to read it, understand it, extract all the data, and make sense of it in your head. And that takes time,” said Julius Jacobsen, a bioinformatics software developer at Queen Mary University of London who co-leads the GA4GH team working on Phenopackets.
“But the Phenopacket provides a sense of how all the bits fit together, like a blank form. All someone has to do is fill in the pre-existing fields, and then they can give you a nice piece of structured information which anyone can understand,” said Jacobsen.
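To make the "blank form" idea concrete, here is a minimal Python sketch of what filling in such a form might look like. The field names (`id`, `subject`, `phenotypicFeatures`, `excluded`, `metaData`) follow the published Phenopacket v2 schema, but the helper `build_phenopacket`, the identifiers, and the trimmed-down `metaData` block are illustrative assumptions. This is not a complete or schema-validated packet.

```python
import json


def build_phenopacket(packet_id, subject_id, observed, excluded):
    """Assemble a simplified, Phenopacket-style dictionary.

    `observed` and `excluded` are lists of (ontology_id, label) pairs.
    This hypothetical helper fills in only a handful of fields and
    performs no schema validation.
    """
    features = [
        {"type": {"id": term_id, "label": label}}
        for term_id, label in observed
    ]
    # Features that were looked for and found absent are marked "excluded".
    features += [
        {"type": {"id": term_id, "label": label}, "excluded": True}
        for term_id, label in excluded
    ]
    return {
        "id": packet_id,
        "subject": {"id": subject_id},
        "phenotypicFeatures": features,
        # Real packets carry more metadata (creation time, author,
        # ontology resources); it is omitted here for brevity.
        "metaData": {"phenopacketSchemaVersion": "2.0"},
    }


packet = build_phenopacket(
    "example-packet-1",          # made-up packet identifier
    "patient-1",                 # made-up, de-identified subject identifier
    observed=[("HP:0001250", "Seizure")],
    excluded=[("HP:0000252", "Microcephaly")],
)

# The same structure is readable by people (as indented JSON) and by software.
text = json.dumps(packet, indent=2)
restored = json.loads(text)
print(text)
```

Recording an excluded feature alongside an observed one is one way a packet can capture what is not there as well as what is.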
Canadian Drive Leads Phenopackets to ISO
A few months after the 2019 GA4GH approval, the newly formed ISO Genomics Informatics subcommittee met in South Korea. The group chose Phenopackets as one of the first standards it would develop, working in tandem with GA4GH contributors who were updating the original version of the standard. (Phenopackets v2 was adopted for the ISO standard after being approved in February.)
To officially propose Phenopackets to ISO, GA4GH Work Stream manager Lindsay Smith, who is based at the Ontario Institute for Cancer Research in Toronto, collaborated with the Canadian Mirror Committee to ISO/TC215/SC1, with the support of SCC.
Through programs like its Innovation Initiative, SCC helps innovators commercialize technologies and facilitates their participation in national and international standardization committees for the benefit of economic growth and the health and safety of Canadians.
“Finding ways to advance health technologies has been an important area of interest for the Innovation Initiative,” said Chantal Guay, CEO of SCC, in the same press release. “Developing an ISO standard is key to aligning perspectives internationally and promoting shared health information across the world.”
Many rounds of reviews from ISO experts in Japan, India, Canada, the U.S., and Korea ensured that Phenopackets would work in diverse healthcare systems.
Transforming Common Diseases
“Asthma, inflammatory bowel disease, schizophrenia, and other complex conditions are unlikely to be one disease. But it’s been difficult to divide these diseases into groups that respond to specific treatments. One reason is because everybody uses their own formats, so you cannot combine data,” said Peter Robinson, a computational biologist at the Jackson Laboratory who co-leads the GA4GH Phenopackets development team.
“By using Phenopackets, we’ll be able to improve precision medicine for individuals by being able to compare and cluster patients based upon their individual characteristics,” he said.
The standard could also improve the study and diagnosis of rare diseases.
“There should be a tool for patients to share their information as Phenopackets,” said Haendel. “Right now, there are rare disease patients all over social media sharing free text that could be structured in such a way that we could mine it as data — for example, to identify patients who have the same condition around the world.”
Patient matchmaking would get easier with a database of cases described in the Phenopacket format.
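As a rough illustration of how matchmaking over such a database might work, the sketch below ranks stored cases by how many observed phenotype terms they share with a query patient. It uses plain set overlap (the Jaccard index) on term identifiers, which is a deliberately naive stand-in. Real matchmaking services would likely rely on ontology-aware similarity measures. The function names, case identifiers, and the `make_case` helper are all hypothetical.

```python
def observed_terms(packet):
    """Return the set of term IDs recorded as present (not excluded)."""
    return {
        feature["type"]["id"]
        for feature in packet.get("phenotypicFeatures", [])
        if not feature.get("excluded", False)
    }


def jaccard(a, b):
    """Share of terms two sets have in common; 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_matches(query, cases):
    """Sort cases from most to least similar to the query packet."""
    query_terms = observed_terms(query)
    scored = [(jaccard(query_terms, observed_terms(c)), c["id"]) for c in cases]
    # Highest score first; ties are broken alphabetically by case ID.
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))


def make_case(case_id, present, absent=()):
    """Hypothetical helper that builds a stripped-down packet for the demo."""
    features = [{"type": {"id": t}} for t in present]
    features += [{"type": {"id": t}, "excluded": True} for t in absent]
    return {"id": case_id, "phenotypicFeatures": features}


query = make_case("query", ["HP:0001250", "HP:0001263", "HP:0000252"])
cases = [
    make_case("case-b", ["HP:0000252", "HP:0001631"]),
    make_case("case-a", ["HP:0001250", "HP:0001263"], absent=["HP:0000252"]),
]

ranked = rank_matches(query, cases)
for score, case_id in ranked:
    print(f"{case_id}: {score:.2f}")
```

Because every case uses the same fields and the same term identifiers, the comparison needs no per-source parsing, which is the practical benefit of a shared format for matchmaking.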
“Many journals in human genetics are willing to consider cajoling or requiring authors to submit Phenopackets together with case reports. Usually if you find a new disease gene, you’ll describe ten patients, but none of that information is accessible at the patient level,” said Robinson.
In June, Robinson, Jacobsen, Haendel, Smith, and collaborators published an article in Nature Biotechnology (DOI: 10.1038/s41587-022-01357-4) outlining how Phenopackets lets researchers and clinicians exchange patient characteristics more effectively — and link those data to genomic information.
Connected Standards Improve Patient Care
While the Phenopacket schema is still available free of charge from GA4GH, ISO publication significantly broadens its reach. In addition to biobanks in Japan, databases such as the widely used BioSamples have already implemented Phenopackets. Electronic health record vendors and national health systems are considering the standard.
As an added benefit, any organization that adopts Phenopackets can easily link to other powerful clinical and research tools from the GA4GH Genomic Data Toolkit.
Going forward, there are plans to build Phenopackets into standards for sharing electronic health records, such as Fast Healthcare Interoperability Resources (FHIR), developed by Health Level Seven International (HL7).
“Phenopackets was chosen as one of HL7’s Vulcan Accelerator projects. Accelerator projects try to improve how clinical studies are designed, conducted and reported by advancing the implementation of research-ready standards. A project to represent Phenopackets in the FHIR standard is underway to make sure that this schema — that’s now an ISO standard — can also be used in the context of HL7,” said Haendel.
Phenopackets may be the first GA4GH standard published by ISO, but it will not be the last. Currently, the ISO Genomics Informatics subcommittee is reviewing a proposed standard for genomic surveillance systems, such as the public health systems that track COVID-19 variants spreading around the world. GA4GH standards feature prominently within the requirements.
When standards development organizations align their work, everyone benefits.
“The ISO publication of Phenopackets exemplifies the benefits of standards coordination. When different standards-setting bodies collaborate, it amplifies the impact of all our standards. Truly global standards expand responsible data sharing and bring the benefits of precision medicine to more patients and their families,” said Peter Goodhand, Chief Executive Officer of GA4GH.