Just 20 years ago, some expected the human genome to hold all the answers. Instead, we accumulated an almost unlimited backlog of new questions – a direct consequence of the ever-growing interest in unlocking the power of the genome through an explosion in new data-based applications of DNA sequencing technologies. From transcriptomics to epigenomics, from portable sequencing devices to graph genome representations, the technological ecosystem around genomics keeps evolving, enabling scientists to deepen their understanding of biological mechanisms and functions.
In healthcare and life sciences, the adoption of genomics technologies has been a core focus for most of the last decade. Since 2013, Genomics England (500,000 genomes) has pioneered the introduction of national-scale infrastructure and services for clinical genomics, starting with the 100,000 Genomes Project. Even before DNA sequencing made it into national healthcare systems, microarray-based genetic testing had been proving the power of precision molecular diagnostics to improve and “personalize” a patient’s experience of care and help reduce its cost. Since 2008, Oncotype DX, a 21-gene expression-based assay, has been used to screen over 900,000 patients, helping hundreds of thousands of women avoid unnecessary chemotherapy. Today, GRAIL’s Galleri Test is being developed to screen patients for more than 50 types of cancer, combining next-generation sequencing (NGS) and machine learning. NHS England recently agreed to evaluate the Galleri Test, starting with 140,000 patients. Indeed, early cancer detection and classification are essential to improving disease outcomes and survival rates over time by allowing the design of focused and specific treatment, which the NHS has made a key aim of its Long Term Plan.
Beyond oncology, Non-Invasive Prenatal Testing (NIPT) has been a successful clinical application of genetic testing. One example is Natera’s Panorama test, which has been used in over three million pregnancies. Furthermore, companies like Illumina are already supplying NGS-based products for NIPT.
The various examples of national precision medicine initiatives (e.g., Genomics England, All of Us Research Program, or the Dubai Genomics Initiative) are creating excitement and anticipation around the introduction of genomics and DNA sequencing technologies in the clinic. However, this is only part of a grander vision – a Precision Medicine vision which is gaining ground across all continents, with multiple initiatives in some countries. In the words of the European Federation of Pharmaceutical Industries and Associations, “Precision medicine is a healthcare approach that utilizes molecular information, phenotypic and health data from patients to generate care insights to prevent or treat human disease resulting in improved health outcomes”. The National Institutes of Health (NIH), which leads the All of Us Research Program with one million genomes, describes precision medicine as “an innovative approach that takes into account individual differences in patients’ genes, environments, and lifestyles”.
Table 1: Examples of production and storage costs of large-scale, population genomics initiatives.
For cost calculation purposes, the following assumptions have been made: (1) the cost of producing a single whole human genome is set at $1,000, which covers all steps from sample extraction to variant calling; (2) the cost of storing a whole genome is set at $10; and (3) the research community is going to create two petabytes of data for each petabyte of whole genome sequenced, which means 1 PB becomes 3 PB of data. The bottom two rows list the equivalents to the U.S. population (334 million citizens highlighted in orange) and the European Union (448 million citizens highlighted in green).
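The arithmetic behind these assumptions can be sketched in a few lines of Python. This is a minimal illustration, not the calculation used to build Table 1: the function and constant names are hypothetical, and the storage figure here covers only the sequenced genomes, since the text does not say whether the derived research data is priced at the same rate.

```python
from dataclasses import dataclass

# Assumptions taken from the text above.
PRODUCTION_USD_PER_GENOME = 1_000  # sample extraction through variant calling
STORAGE_USD_PER_GENOME = 10        # storing one whole genome
DERIVED_PB_PER_RAW_PB = 2          # research data created per PB sequenced


@dataclass(frozen=True)
class CostEstimate:
    genomes: int
    production_usd: int
    storage_usd: int


def estimate_costs(genomes: int) -> CostEstimate:
    """Return production and storage costs for a cohort of whole genomes."""
    if genomes < 0:
        raise ValueError("genomes must be non-negative")
    return CostEstimate(
        genomes=genomes,
        production_usd=genomes * PRODUCTION_USD_PER_GENOME,
        storage_usd=genomes * STORAGE_USD_PER_GENOME,
    )


def total_data_pb(raw_pb: float) -> float:
    """Raw sequence data plus the derived data the research community creates."""
    return raw_pb * (1 + DERIVED_PB_PER_RAW_PB)


# Cohort sizes mentioned in this article; labels are for display only.
cohorts = [
    ("100,000 Genomes Project", 100_000),
    ("Our Future Health (if fully sequenced)", 5_000_000),
    ("U.S. population", 334_000_000),
    ("European Union", 448_000_000),
]

for label, size in cohorts:
    est = estimate_costs(size)
    print(f"{label}: production ${est.production_usd:,}, storage ${est.storage_usd:,}")
```

Under these assumptions, production cost dominates storage cost by a factor of 100, which is why the per-genome sequencing price drives most of the budget at population scale.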
One of the most ambitious national precision medicine initiatives, the Dubai Genomics Programme (sequencing millions of genomes), lists long-term benefits such as:
- Eradicate/contain diseases arising from genetic origins
- Pre-empting disease by counselling individuals at risk, allowing lifestyle changes
- Facilitate the adoption of Personalized Medicine – treat the patient, not just the disease
- Preventive/treatment protocols created and published – with the Dubai Genomics Programme leading in advanced medicine
- Breakthroughs in genetic research
- Happy and healthy society
As reflected in these high expectations, Precision Medicine holds the potential to help healthcare tackle its two greatest challenges: an aging population, and the growing burden of chronic, life-long diseases.
Yet, to ease pressure on public healthcare systems, and to translate our ambitions into concrete benefits for patients, national Precision Medicine roadmaps need to account for three key challenges:
- Data Access: the tension between patient privacy and innovation;
- Data Monetization: the tension between costs and value of patient data; and
- Data Scaling: the lack of population-scale infrastructure and standards.
There is no universally accepted definition of Precision Medicine, but most interpretations recall the “4Ps”: Personalized, Predictive, Preventative, and Participatory. The first three “Ps” were conceptualized back in 2004 by Weston and Hood. However, the fourth “P” is possibly the most important. Participatory: a Precision Medicine approach cannot exist in practice without the active involvement and contribution of patients. The Genome UK strategy is very clear regarding this aspect: any innovation in healthcare based on patient data must deal with patients’ consent and with public opinion. In the last decade we have already encountered examples of genomics projects which enjoyed great control over patient data but struggled to maintain their relationship with the public. For the long-term transformation of healthcare systems, rules and boundaries on patient data usage need to be established early on, in alignment with each country’s culture and situation.
The public is a key stakeholder in the introduction of precision medicine approaches in healthcare. The support of patients and taxpayers is key to achieving the transformation of national healthcare systems, to enabling more tailored clinical trials, and to maintaining repositories which can accelerate the creation of better and more accurate diagnostic and preventive solutions.
The availability of population health data is the foundation of the future of healthcare and life sciences. However, not all genomes are born equal: the demand for patient information can vary greatly, depending on the scientific question, the therapeutic area studied, the family history, and other factors. Within the Precision Medicine landscape, organizations can be identified as data suppliers or data consumers based on their roles and needs (see Figure 1).
Figure 1: The Precision Medicine Landscape (with GP being General Practitioner, NPO being non-profit organization, and ISV being Independent Software Vendor).
The different stakeholders of the Precision Medicine landscape have different needs. Data suppliers need to maintain population-scale repositories for the long term, while securely making the data available for research. Data consumers need to obtain and maintain access to multiple third-party repositories, which might come with different consent models and data schemas.
National genomic initiatives are key data suppliers, tasked by governments to generate and manage population-scale datasets for hundreds of thousands of patients. In the UK, Our Future Health is creating a repository to manage the data of five million NHS patients. That’s ten times the size of the UK Biobank. If all those five million patients had their whole genome sequenced, the UK government would need to invest billions of dollars in data generation alone. As such, the private sector plays a key role in determining the value of investing in population-scale sequencing efforts. There are many reports of pharma companies investing unprecedented amounts to access data (genomics and metadata) from organizations like Genuity Science, 23andMe or UK Biobank. The real value of the data can only be understood by the buyer. This is why so many data owners are offering not mere data licensing agreements, but “co-development” programs. The direct visibility into data utilization helps bring clarity on the correct pricing strategy for government-sponsored datasets, and this trend is only expected to become more pronounced in the near future.
More transparency into data utilization is going to help ease the tensions on both data access and data monetization. However, even with modern regulatory frameworks and strong alignment between the public and private sectors, access to clinical data won’t produce the desired benefit without the right infrastructure behind it. Genomic datasets already pose a major challenge for data storage: today’s petabytes of data will grow into exabytes over the coming decades. But the storage footprint of human genome data is only part of the challenge.
Pharma and biotech companies have been welcoming the availability of larger and better characterized datasets, for the purpose of stratifying and filtering patients for clinical trials and drug development. However, to quote John Bell, UK government lead for Life Sciences strategy, “genomic data in a void is sterile”. A patient’s profile is made from much more than genomic data. “Holomic” datasets include demographic, socio-economic, histological, radiomic, proteomic and phenotypic information, on top of data gathered from IoT and mobile applications.
“Genomics is a fundamental part of precision medicine, but only one part.”
Artificial Intelligence (AI) is increasingly necessary to extract value from multi-modal datasets, identifying patterns within patient profiles and entire cohorts, and turning complexity into diagnoses. However, implementing AI and machine learning can be a challenge in itself: datasets can be incomplete, and inconsistent with one another because they follow different data schemas. Moreover, many datasets are effectively siloed and difficult to access for research collaborations.
These silos, or “walled gardens”, are part of an old paradigm: the concept that data monetization would come from licensing access to an environment providing exclusive access to the data. Today, the increasing drive for international research collaborations has led to a paradigm shift towards federated research platforms, where data can be made available to international researchers without the need to copy entire snapshots of the data from one place to another.
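The federated idea can be illustrated with a toy Python sketch: each site keeps its records locally and answers a query with an aggregate count only, so no patient-level data is copied to the researcher. Everything here is hypothetical (the `Site` class, the record fields, the query), and it does not represent any specific platform or standard. Real federated platforms add authentication, consent checks, and disclosure controls that this sketch leaves out.

```python
from typing import Callable, Dict, Iterable, List

# A hypothetical patient record: a plain dictionary of fields.
Record = Dict[str, object]


class Site:
    """A data supplier that holds records locally and never exports them."""

    def __init__(self, name: str, records: Iterable[Record]) -> None:
        self.name = name
        self._records: List[Record] = list(records)

    def count_matching(self, predicate: Callable[[Record], bool]) -> int:
        # Only the aggregate count leaves the site, not the records.
        return sum(1 for record in self._records if predicate(record))


def federated_count(
    sites: Iterable[Site], predicate: Callable[[Record], bool]
) -> Dict[str, int]:
    """Send the same query to every site and collect per-site counts."""
    return {site.name: site.count_matching(predicate) for site in sites}


# Toy data, invented purely for illustration.
site_a = Site("site_a", [
    {"age": 61, "carrier": True},
    {"age": 45, "carrier": True},
    {"age": 70, "carrier": False},
])
site_b = Site("site_b", [
    {"age": 55, "carrier": True},
    {"age": 66, "carrier": True},
])


def carriers_over_50(record: Record) -> bool:
    return bool(record["carrier"]) and int(record["age"]) > 50


per_site = federated_count([site_a, site_b], carriers_over_50)
total = sum(per_site.values())
print(per_site, total)
```

The design choice worth noting is that the query travels to the data rather than the data to the query, which is what allows each site to keep enforcing its own consent model.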
The need for federated research environments has also been expressed by Genome UK, with a minimum of one federated research platform per nation. Organizations like the Global Alliance for Genomics and Health and the ELIXIR consortium in Europe have been leading international efforts to introduce new standards to enable federated clinical research across national borders. To achieve this, cloud infrastructure is an absolute necessity. Since data localization is a priority, cloud infrastructure providers have been progressively deploying data centers within dozens of countries, increasingly supporting data localization requirements and enabling the highest levels of security and compliance in the industry. The door is open for cloud infrastructure providers to become key enablers of international standards. The EU’s B1MG initiative might become one of the first demonstrations of how global standards can accelerate innovation in Healthcare and Life Sciences for the benefit of both the public and private sectors.
To make patient data available for research and AI/ML applications, it is essential to align infrastructure development across the public and private sectors. It is fundamental for commercial organizations to partner with national governments, especially now that challenges and opportunities around infrastructure and standards for Precision Medicine are more prominent than ever before. There has never been a better time to get involved and help drive the promise of Precision Medicine, which can be summarized as to “prevent illness before it happens”. Our patients need this, today!
Dr. Alessandro Riccombeni – enlightenbio Guest Blogger
Dr. Alessandro Riccombeni is Head of Precision Medicine, EMEA for Amazon Web Services (AWS). He has more than 13 years of international experience in genomics research, product development, and business strategy across the US and EMEA. Prior to joining AWS, Alessandro was Director of Business Development at DNAnexus, supporting collaborations between public and private sectors to enable population-scale genomic initiatives. In industry, he led the development of cloud platforms for bioinformatics and commercial services for gene editing. Alessandro started his career in research, leading projects in Oncology and Infection Biology for 6 years before moving into industry. Alessandro holds a PhD from University College Dublin and an MBA from London Business School.