The following is a repost of a recent LinkedIn Pulse post by Tal Behar, Co-founder & Executive Director at PMWC Intl. We believe this post, a set of questions and answers with the UCSC Genomics Institute’s director David Haussler, is of interest to the enlightenbio audience, which is why we requested the repost. It addresses big genomic data questions related to the clinical setting: large-scale genomics activities under way or about to start, the efforts of the Data Working Group of the Global Alliance for Genomics and Health (GA4GH) and why this working group is needed, observed challenges with data storage, analysis, and sharing platforms, and what other organizations or commercial companies can learn from organizations like the UC Santa Cruz Genomics Institute.
Of particular relevance in this context is the fact that Professor David Haussler is chairing a session at the upcoming annual Personalized Medicine World Conference, PMWC 2016 Silicon Valley, on “Data Solutions in Clinical Genomics”. This session includes a panel with Jonathan Sheldon (Global Vice President Healthcare at Oracle), Carlos Bustamante (Professor of Genetics at Stanford University School of Medicine), and Taylor Sittler (Co-Founder and Medical Lead at Color Genomics). It is sure to generate interest, alongside the many other exciting parts of the PMWC program. If you are unfamiliar with PMWC, it has grown substantially over the years into a well-established conference attended by many leading authorities and experts from across the healthcare and biotechnology sectors. In addition to a strong lineup of speakers, the conference is an excellent networking event with about 1,200 attendees. Every year, we at enlightenbio look forward to attending this conference and hearing first-hand about recent developments in the field of Personalized Medicine.
The following is the interview conducted by PMWC Intl. with Professor Haussler.
PMWC: What are some of the biggest challenges when it comes to working with genomics data in the clinical setting and what are suggested solutions to address these challenges?
DH: As we start sequencing genomes, individual companies, institutions, and laboratories find genetic variants that they have not seen frequently before and are not sure how to interpret – they could have important clinical implications, or they could be totally harmless. Because there is significant variation in the human genome from one individual to the next, and because we have not yet sequenced enough genomes to be able to assign risk to the majority of variants, a significant fraction of the cases seen by typical institutions will be so-called “variants of uncertain significance.” Many of these are very similar to, but not exactly like, those that are known to have health consequences, so clinicians often take conservative preventative measures to be safe. This can lead, in some cases, to unnecessary medical procedures and responses and result in undue health anguish. Thus, a fully genetically informed clinical decision process is profoundly important for the patient.
PMWC: Why are large-scale genomics efforts needed? How are you and your team involved?
DH: The only way to deal with variants of unknown significance is to create a large-scale genomic knowledge network that includes all the genetic variants observed worldwide, and their associated health observations. This requires a global network of shared data. The vast majority of human genetic variants are individually rare but collectively common, so this effort will have health impact on almost everybody. My team is helping to build the tools and approaches necessary to share data responsibly and effectively.
PMWC: There are numerous large-scale genomics efforts under way or in the process of getting started.
- Will the size, in terms of number of samples, of these sequencing projects be sufficient to fully understand most rare diseases?
- If not, how can this be better addressed: algorithmically, via data sharing platforms, or by other means?
DH: It is wonderful that several countries are launching very large case cohort studies, including the Precision Medicine Initiative in the US and the 100,000 Genomes Initiative in England. These will provide consistent and controlled data sets for research that are larger than have ever been available before. However, we must also move very aggressively to begin sharing genetic data from routine clinical practice, which will very soon be generating orders of magnitude more data than all the controlled cohort studies combined. The tyranny of numbers in statistics points to this as the information source of greatest potential power. However, to access that potential and turn routine clinical practice into revolutionary learning for health and medicine, we need to create widely used data accessing platforms that bring all the knowledge together.
PMWC: What are the promises and challenges of sharing clinical and research sequencing data?
DH: We will ultimately attain a whole new level of understanding of how our individual detailed genetic variation influences our health and learn to exploit this knowledge, either by adjusting our diet, behavior, and lifestyle or through direct medical intervention. We’ll be able to monitor and control diseases of aging such as cancer, diabetes, and Alzheimer’s at their molecular basis. We’ll gain a detailed understanding of the current state of our immune system, what it needs to do to make us healthier, and how we can help it along to work for us.
The most difficult challenges to the controlled sharing of genetic data are social challenges, e.g. privacy issues and competition between medical institutions, but there are also significant technical challenges, and there is much interplay between the two types of challenge.
PMWC: You are a co-founder and co-chair of the Data Working Group of the Global Alliance for Genomics and Health. Can you tell us more about the initiative and efforts under way and why this working group is needed?
DH: The Global Alliance for Genomics and Health (GA4GH) is an international network of researchers, clinicians, companies, and patient and disease advocates who are working together to create the harmonized tools and approaches necessary for responsible and effective data sharing. The Data Working Group of the GA4GH is addressing the technical challenges of accessing data distributed globally. We are the geeks that have come together from hundreds of companies, nonprofits, and universities to get the technical data-sharing job done. Our sister Working Groups (Regulatory and Ethics, Security, and Clinical) are handling the complementary issues that put our work into context.
PMWC: What is the ideal data storage, analysis, and sharing platform for genomics data?
DH: We are working on that! It is not an easy question. One thing we have learned is that global social realities dictate that there will not be a single centralized database of all personal genome data. We simply cannot create enough global trust in any single government or privately funded institution for that. However, the Internet itself is an example of a very successful decentralized platform that became sufficiently trusted to attain global use, and that forms the basis for many government and privately funded “apps” that make beneficial use of all kinds of data, including personal data. The GA4GH Data Working Group is essentially trying to extend the Internet a bit further so that it includes a little more specialized infrastructure that understands and responds to the need for accessing global genetic data, while paying careful attention to very important privacy issues.
PMWC: What can other organizations or the commercial sector working on delivering these types of platforms learn from organizations like the UC Santa Cruz Genomics Institute?
DH: One thing that is evident is that it is valuable to have a neutral third party — an entity that does not itself have a competing medical institute, a corporate board of directors with an overriding profit motive, or a political mandate — to convene disparate groups around global standards and shared infrastructure. This is what the UC Santa Cruz Genomics Institute does. We began by posting the first draft of the human genome sequence on the Internet in July of 2000 in collaboration with the International Human Genome Sequencing Consortium, followed through with the UCSC Genome Browser and the CGHub Cancer Genomics database, and are now devoting considerable energy to the GA4GH. We would like to reach out to and encourage everyone who wants to make an impact on human health to join us and contribute to a growing worldwide data sharing movement.