For four days, San Francisco’s Japantown was the center of genomics

At the end of June I was fortunate enough to attend the second annual Clinical Genome conference, a meeting hosted by Cambridge Healthtech Institute. The conference was divided into two sections: “The Science of Investigation and Interpretation” (all about science) and “The Business of Integration and Implementation” (all about business and regulatory aspects). Kudos to the organizers for selecting a great set of speakers and a terrific location, smack in the middle of Japantown in San Francisco, surrounded by good restaurants and a nice coffee shop right across the street. I want to share with you some of the highlights from the science section of the conference.

Let me start out with some interesting quotes I heard over the course of the first two days while attending the science section:

B. Korf: Genome is the library of life, and not the book of life – we can read the words, but we do not understand them.
B. Korf: It is a HIPAA violation to look up your own medical record.
B. Korf: What about an iTunes of medical records? Similar issue for medical genomics data to what was an issue before for music.
H. Rehm: 66% (or 81% of variants for hearing loss) of variants identified were only found in one disease case.
H. Rehm: Problem of individual variants being double counted. Individual patients need to be barcoded so that they are not shown multiple times in multiple studies. Possibly add a patient barcode that can be added to each study.
R. Scott: The gene in biology is what the atom is in chemistry.
R. Scott: Decreasing genetic testing cost will increase clinical and personal utility.
M. Kircher: It cannot all be black and white; when it comes to genomics sequence data we have to accept uncertainties.
G. Lyon: “Proprietary databases” are like “walled gardens” – we need to get into this pre-competitive space and share data with each other!
G. Lyon: Two years ago it was OK to do anecdotes – now it is better to get together for a systematic approach (sequence millions and get better variant calling methodologies).
G. Lyon: With exome sequencing you would only look where the light is.

The conference was kicked off by Kevin Davies (founding editor-in-chief of Bio-IT World and now VP of Business Development & Publisher at C&EN at the American Chemical Society). Kevin started out with an anecdote marking the 10th anniversary of the declaration that the human genome was complete, showing a slide of the Leicester group of scientists who printed an entire genome in 130 books to demonstrate just how much information it contains. Printing it nowadays costs a lot more than actually sequencing it (currently ~$5,000)!

The first speaker of the morning session was Bruce Korf (University of Alabama at Birmingham). The focus of his talk was the integration of genetics into the day-to-day workflow of clinical medicine. He highlighted that one should look through three lenses: (1) prevention, (2) diagnostics, and (3) treatment. After a deeper discussion of each point, Korf pointed out that one of the bigger issues of personal genomics is that it is a “disruptive technology” and that in an ideal scenario, there should be one doctor and one counselor per patient; clearly, this will not work in today’s health care world. And of course there are the incidental findings, for which the physician uses his/her own judgment on which results to present to the patient. This can create a huge burden for both physicians and patients. To sum up, Korf put a big emphasis on education as the path forward, as medicine becomes too difficult and complex. A new training paradigm is required that opens the door to non-geneticists. What we need are point-of-care decision support tools that allow health care providers to explain the “why” in addition to the “what” and the “how.”

Next up was Heidi Rehm (Brigham and Women’s Hospital and Harvard Medical School), who is also director of the Laboratory for Molecular Medicine at Partners Healthcare Center for Personalized Genetic Medicine (LMM), which currently offers more than 150 tests in cardiovascular disease, cancer, hearing loss, and pharmacogenetics. Rehm pointed out that there are still substantial limitations in the analytical validity of NGS and in calling accuracy for variants, indels, CNVs, SVs, and repeats. Rehm covered a couple of interesting use cases that demonstrated the value of targeted gene panel sequencing or whole genome sequencing. For cases where interpretation was challenging, they have developed the “Matchmaker System,” which lets clinical labs work with researchers to help resolve unsolved exomes and whole genomes. The Matchmaker System, part of PhenoDB, is currently being developed in collaboration with Ada Hamosh (Johns Hopkins University). Additionally, Rehm discussed the interpretation bottleneck. She noted that LMM has interpreted about 15,000 variants in different patient reports to date and that there is no way to read through the many different papers to evaluate the evidence at scale; automation is clearly needed but challenging. Rehm pointed out that 66% of the variants identified (81% for hearing loss) were found in only one disease case. To address this dilemma, standardization is an absolute requirement, with a centralized database and an evidence-expert consensus process. As a result, the ClinVar database was launched in April of 2013. ClinVar today has more than 40 contributing labs and over 42,000 variants, of which 5,000 have been added by InVitae. ClinVar, which is publicly available, is a clinical-grade database with both pathogenic and benign variants.
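To make the "found in only one disease case" figure and Rehm's patient-barcode suggestion a bit more concrete, here is a minimal Python sketch. It is purely illustrative and is not how LMM or ClinVar count variants: the record layout (study, patient barcode, variant), the function names, and the sample data are all made up for this example. The idea is simply that counting distinct barcodes, rather than raw reports, keeps a patient who appears in several studies from being counted twice.

```python
from collections import defaultdict

def case_counts(observations):
    """Count distinct patients per variant.

    `observations` is an iterable of (study_id, patient_barcode, variant)
    tuples. This layout is an assumption made for this sketch.
    """
    patients_by_variant = defaultdict(set)
    for _study, barcode, variant in observations:
        # A set ignores repeat reports of the same patient across studies.
        patients_by_variant[variant].add(barcode)
    return {variant: len(patients) for variant, patients in patients_by_variant.items()}

def singleton_fraction(observations):
    """Fraction of distinct variants seen in exactly one distinct patient."""
    counts = case_counts(observations)
    if not counts:
        return 0.0
    singletons = sum(1 for n in counts.values() if n == 1)
    return singletons / len(counts)

# Made-up example records; the barcodes and variant labels are hypothetical.
observations = [
    ("study_A", "P001", "variant_1"),
    ("study_B", "P001", "variant_1"),  # same patient reported a second time
    ("study_A", "P002", "variant_1"),
    ("study_B", "P003", "variant_2"),
    ("study_C", "P003", "variant_2"),  # same patient in two studies
]

print(case_counts(observations))
print(singleton_fraction(observations))
```

Without the barcode, variant_2 would look like it had been seen in two cases (one per study) even though both reports describe the same patient.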

Randy Scott (InVitae) gave a great talk summarizing activities to achieve InVitae’s goal: “Bring comprehensive genetic information into routine medical practice to improve the quality of healthcare for billions of people by aggregating the world’s genetics tests into a single assay with better quality, faster turnaround time, and lower price than most of the single-gene tests today.” Scott mentioned that the test is currently available for early commercial access and contains up to 211 genes. The initial focus will be on rare diseases. InVitae, which started last year, has a clinical lab and about one scientist per software engineer. Scott outlined the company’s strategy in four steps:

organize all the world’s clinically relevant genetic information – currently about 500 genetic conditions have been curated and more than 32,000 clinically relevant variants documented in their database. For this process, they plan to automate the data analysis infrastructure for fast variant calling. In this context Scott mentioned the recently launched “Free the Data” movement, whose data will be deposited in ClinVar and other public databases with each patient’s consent. Their goal is to have the data for all Mendelian diseases.
build an industrial-strength clinical genetic testing infrastructure
build a comprehensive genome management infrastructure
global marketplace – all researchers share access

With the following business principles:

patients control their own genetic information and how it is shared
order acceptance via physicians only
support the elimination of DNA sequence patents
decreasing genetic testing cost –> increase clinical and personal utility –> Free the Data movement

Jose Jarvis (Coriell Institute for Medical Research) gave a great overview of Coriell’s personalized medicine effort, which was launched in 2007. Jarvis pointed out that the study, with its more than 7,500 participants, has been tremendously valuable in establishing a rigorous process for genomic testing, data interrogation, and genomic output / report generation. The lessons learned will be directly translated into a new spinoff called Coriell Life Sciences, which will offer novel approaches to genomics testing.

On Day Two of the interpretation part of the conference, Catherine Brownstein (Gene Partnership, Boston Children’s Hospital) talked about the need for clarity in clinical genome sequencing. The lack of “clear and consistent methods of clinical data approaches for patient care” was identified by a set of key individuals in the space as the biggest barrier. To address this issue, the CLARITY (Children’s Leadership Award for the Reliable Interpretation and Appropriate Transmission of Your genomic information) challenge was created as a crowdsourcing approach to data analysis. A total of 23 teams submitted entries for the analysis of three different data sets (all IRB-controlled data), and the winner (Brigham and Women’s Hospital, Boston) was announced at ASHG 2012. Most interesting was a follow-up survey (100 questions sent to the 23 participants) that they conducted and shared at the conference. Here are a few excerpts, but I understand the full survey will be published later in the year:

Among the different teams, widely varied approaches were used to tackle the analyses, with a mix of commercial, open-source, and in-house tools
60% applied variant filtering and recalibration to their analysis — filtering variants by relevance to pathogenicity was considered important
For variant analysis, most did their own interpretation — 10% used Ingenuity’s Variant Analysis tool
65% used their own annotations
Across the analysts, opinions differed greatly about the necessary coverage
In general, there was a bigger concern about false negatives than false positives
Reporting methods were not uniform at all

David Mittelman (Bioinformatics Institute, Virginia Tech) introduced the audience to the recently launched GCAT platform (actually rolled out earlier this year at Bio-IT World in Boston). This may be a very interesting and valuable tool, as it allows researchers to compare different analysis tools and their performance side by side. Here are some of the highlights of what GCAT can do:

Compare outputs, rather than analyzing a specific data set
Measure the performance of different analysis tools and assess their functions to understand where they perform to satisfaction
Allow researchers to collaboratively test and validate different tools for their specific data and/or analysis workflows
Set standards via crowdsourcing
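The idea of comparing tool outputs rather than analyzing a single data set can be illustrated with a small Python sketch. This is not GCAT's actual method or API, just one simple way to put two call sets side by side. The function name, the (chromosome, position, ref, alt) tuple layout, and the example calls are all assumptions made for illustration.

```python
def compare_call_sets(calls_a, calls_b):
    """Summarize agreement between two variant call sets.

    Each call is a (chrom, pos, ref, alt) tuple; this layout is an
    assumption for the sketch, not a format defined by any tool.
    """
    a, b = set(calls_a), set(calls_b)
    shared = a & b
    union = a | b
    return {
        "shared": len(shared),      # called by both tools
        "only_a": len(a - b),       # called by tool A only
        "only_b": len(b - a),       # called by tool B only
        # Jaccard index: shared calls divided by all distinct calls.
        "jaccard": len(shared) / len(union) if union else 1.0,
    }

# Hypothetical calls from two pipelines run on the same sample.
tool_a_calls = [
    ("chr1", 100, "A", "G"),
    ("chr1", 200, "C", "T"),
    ("chr2", 50, "G", "A"),
]
tool_b_calls = [
    ("chr1", 100, "A", "G"),
    ("chr2", 50, "G", "A"),
    ("chr3", 75, "T", "C"),
]

summary = compare_call_sets(tool_a_calls, tool_b_calls)
print(summary)
```

A real comparison would also need to normalize how each tool represents indels and multi-allelic sites before matching; exact tuple equality is the simplest possible stand-in.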

Next up was Gholson Lyon (Cold Spring Harbor Laboratory), who gave an excellent talk on some use cases highlighting the need for more accurate variant calling for personal genomics. One of the cases presented was that of two brothers with intellectual disability, autism, ADHD, and hearing difficulties. After many tests, whole genome sequencing across the entire family helped identify a mutation in the transcription factor ZNF41 that may explain the disease. This example demonstrated that some of these diseases are so complex in nature that they require a “networking of science” model, such as an online database with genotype and phenotype information. His take-home message was: “Two years ago it was OK to present anecdotes, but now we have to do better and take a more systematic approach. We can now sequence millions and have access to better variant-calling methodologies.”

I came away from the conference feeling enthusiastic about the future of genomics in the clinic, but more aware than ever of the many roadblocks between here and there. I will be keeping an eye on several of these projects in the coming months and hoping for significant progress that will lead to better treatment options and health outcomes for patients.