Over the last couple of months, I’ve come across a great number of tools that engage the crowd to solve a problem. I wanted to dig a little bit deeper to understand some examples of what has been done using the crowd.
Crowdsourcing covers a wide range of activities, including crowd-funding, crowd-voting, crowd-based competition, crowd-based gaming, and the recently added crowdcoding. Before it entered the healthcare sector, crowdsourcing was used in gaming and other industries. One particularly well-publicized effort was the Netflix Prize, launched in 2006 and awarded in 2009, which offered $1 million for the best algorithm for recommending movies based on past preferences.
Engaging the community, or “crowding them,” has recently been picked up in basic research and the medical/healthcare sector with some great projects. With this post, I wanted to review some of those initiatives, what makes them interesting and worth mentioning, and why we should apply similar approaches to the world of healthcare and genomics. I believe this is the only way we will reach a common goal: faster and better patient care. It is exciting to see that this belief is shared by others in the life science community, as demonstrated by the announcement of the “Empowered Genome Community” by Ingenuity Systems, a Qiagen company. People who have had their genome sequenced can join the community, upload their data, and use the pool of data to analyze their own genome, all for free. As Nathan Pearson from Qiagen pointed out in the press release, “The Empowered Genome Community adds a key piece to public sequencing efforts like the PGP: a way for citizen-scientists to explore their data, together with full-time researchers, to spark new insights for common good.”
As you’ll see from the examples I have included below, marshaling the intelligence and human element of the crowd allows for scaling of data analysis, faster and possibly more accurate results, and knowledge extraction, all of which are extremely important elements when it comes to personal healthcare.
What are some of the crowd / community-based examples in life sciences?
HealthTap: A great example of a community effort! The business is based on the proposition that a battalion of doctors and health experts can provide better and possibly wiser information to help others make intelligent decisions. HealthTap is a combination of crowd-sourced health advice and community-building. It helps people answer basic questions such as, “What possible conditions match this symptom?” or “What should I do about this problem?” HealthTap does not actually provide healthcare online, but it provides information directly from doctors on specific health issues that concern individuals.
CrowdMed: Instead of relying on individual physicians, CrowdMed uses the collective intelligence of thousands to produce accurate and insightful diagnostic suggestions. Their goal is to be relied upon by millions of patients as a trusted source of medical answers. Anyone, with or without a medical background, can suggest diagnoses for complicated symptoms and conditions, in an effort to reduce the communication barriers often seen in hospitals or with specialists. It uses the same collaborative process seen in teaching hospitals, where physicians and trainees work together to solve cases. The premise is that a large crowd working on a problem arrives at a diagnosis faster.
FDA Adverse Event Reporting System (FAERS): A system for voluntarily reporting a serious adverse event or product quality problem associated with the use of a marketed drug, biologic, medical device, dietary supplement, or cosmetic. These voluntary reports are critical to ensuring patient safety. The FDA relies on them because some drug effects are only observed after a product has been released to a wider audience. The database is designed to support the FDA’s post-marketing safety surveillance program for drug and therapeutic biologic products. The data is available for download.
Evidence of drug interactions from search log data: An interesting paper describing how the crowd can be used for adverse drug reaction (ADR) reporting, a way to scale pharmacovigilance by listening to signals from the crowd. Compared to other sources such as electronic health records (EHR), search logs are inexpensive to collect and mine, are not dependent on healthcare utilization, and are not subject to the same latencies. The results demonstrate that the logged search activities of populations of computer users, as captured by internet services, can contribute to drug safety surveillance.
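To make the idea of listening for signals in search logs a little more concrete, here is a minimal sketch in Python. It is not the method used in the paper. It assumes each user's log has already been reduced to a set of search terms, and the function name, the drug and symptom labels, and the data are all hypothetical. It simply compares how often a symptom is searched by people who looked up both drugs versus only one of them.

```python
def symptom_search_rate(users, required, excluded, symptoms):
    """Among users who searched every term in `required` and none in
    `excluded`, return the fraction who also searched any symptom term.
    Returns None when no user falls into the cohort."""
    cohort = [terms for terms in users.values()
              if required <= terms and not (excluded & terms)]
    if not cohort:
        return None
    hits = sum(1 for terms in cohort if terms & symptoms)
    return hits / len(cohort)


# Hypothetical, made-up logs: user id -> set of terms that user searched for.
USERS = {
    "u1": {"drug_a", "drug_b", "symptom_x"},
    "u2": {"drug_a", "drug_b"},
    "u3": {"drug_a"},
    "u4": {"drug_a", "symptom_x"},
    "u5": {"drug_b"},
    "u6": {"drug_b"},
    "u7": {"drug_a"},
    "u8": {"drug_a"},
}
SYMPTOMS = {"symptom_x"}

rate_both = symptom_search_rate(USERS, {"drug_a", "drug_b"}, set(), SYMPTOMS)
rate_a_only = symptom_search_rate(USERS, {"drug_a"}, {"drug_b"}, SYMPTOMS)
rate_b_only = symptom_search_rate(USERS, {"drug_b"}, {"drug_a"}, SYMPTOMS)

print("both drugs:", rate_both)
print("drug A only:", rate_a_only)
print("drug B only:", rate_b_only)
```

A higher rate in the "both drugs" group would only be a hint worth following up. A real analysis would need far more data, statistical testing, and care about confounders.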
Crowdsourcing via gaming
Malaria study: This study used crowd-sourced gaming to investigate whether the innate visual recognition and learning capabilities of untrained humans can be used to conduct reliable microscopic analysis of biomedical samples toward diagnosis. The researchers designed an entertaining digital game that interfaced with artificial learning and processing back-ends. They showed that for binary medical diagnostic decisions (e.g., infected vs. uninfected), crowd-sourced games can approach the accuracy of medical experts. Specifically, using non-expert gamers, the diagnosis of malaria-infected red blood cells was reported with high accuracy.
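One simple way to think about combining many non-expert answers into a single binary call is a majority vote. The sketch below is only an illustration of that idea, not the study's actual pipeline, which also involved learning and processing back-ends. The function names, image ids, and votes are hypothetical.

```python
from collections import Counter


def majority_label(votes):
    """Return the most common label; empty or tied votes give 'uncertain'."""
    counts = Counter(votes).most_common()
    if not counts:
        return "uncertain"
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return "uncertain"
    return counts[0][0]


def aggregate(votes_by_image):
    """Map each image id to the crowd's majority label."""
    return {image_id: majority_label(votes)
            for image_id, votes in votes_by_image.items()}


def agreement_with_expert(crowd_labels, expert_labels):
    """Fraction of expert-labeled images where the crowd label matches."""
    scored = [img for img in expert_labels if img in crowd_labels]
    if not scored:
        return None
    correct = sum(crowd_labels[img] == expert_labels[img] for img in scored)
    return correct / len(scored)


# Hypothetical, made-up votes from gamers on three cell images.
VOTES = {
    "cell_1": ["infected", "infected", "uninfected"],
    "cell_2": ["uninfected", "uninfected", "uninfected"],
    "cell_3": ["infected", "uninfected"],
}
EXPERT = {"cell_1": "infected", "cell_2": "uninfected", "cell_3": "infected"}

crowd = aggregate(VOTES)
print(crowd)
print(agreement_with_expert(crowd, EXPERT))
```

Treating ties as "uncertain" is a design choice made here for illustration. In practice, such cases could be routed to more players or to an expert.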
Data analysis using the crowd
E. coli O104:H4 Crowdsourced Genome Analysis: In 2011, the crowd came together to analyze the deadly Shiga toxin-producing E. coli O104:H4 strain responsible for the outbreak in Germany and elsewhere in Europe. The release of the data for several different isolates triggered a frenzy of crowdsourced analyses by bioinformaticians across the globe. Only a day later, a de novo assembly of the genome had been produced; within a week, more than 20 entries had been filed on a new website dedicated to genomics of the strain, revealing details of its pathogenic potential and evolutionary origins.
Matchmaker system for Exomes and Genomes: A system developed for clinical labs to work with researchers to resolve unsolved exomes and whole genomes (for cases where interpretation was challenging). The Matchmaker System is part of PhenoDB and developed in collaboration with Ada Hamosh (OMIM/Hopkins), as discussed earlier this year by Heidi Rehm (Partners HealthCare) at several conferences.
ClinVar: Understanding the impact of genomic variation requires classifying variants and annotating them with pathogenicity information. Interpreting variant data has become a major bottleneck. With the mountain of data being generated, analysis by individual labs no longer scales, and the problem can only be addressed by engaging the community. Out of this need, ClinVar was created in April 2013 as a variant annotation database for depositing and sharing curated genomic variants and making them available to the community. As of June 2013, ClinVar had more than 40 contributing labs and over 42,000 variants, of which 5,000 had been added by InVitae.
This development was further substantiated by the recent announcement of $25 million in four-year grants awarded by the NIH to three research groups to develop authoritative information on the millions of genomic variants relevant to human disease and the hundreds that are expected to be useful for clinical practice. The grants will support a consortium of research groups to develop the Clinical Genome Resource (ClinGen). The investigators will design and implement a framework for evaluating which variants play a role in disease and which are relevant to patient care. The information obtained will then be distributed via the ClinVar database.
uBiome: uBiome is probably one of the most publicized crowd activities in our space. uBiome raised $350K in crowdfunding, the largest citizen-science project so far. With uBiome, users own their data, but can share it with the rest of the community as a public data set or make it available for research purposes. uBiome’s goal is to research the personal microbiome, how it correlates with other microbiomes, and how it correlates with different conditions such as asthma, diabetes, autism, depression, irritable bowel, Crohn’s disease, chronic sinusitis, heart disease, etc.
Free the Data: Last but not least, Free the Data is a great initiative that builds on the idea that genetic information on hereditary breast and ovarian cancer (BRCA1 and BRCA2 mutations) is more valuable when shared. Sharing this information helps the medical community understand how these mutations impact disease and therefore helps clinicians improve patient care. Keeping these mutations a trade secret does more harm than good. Free the Data empowers women (and men) to have an impact on their own health and that of others.
And the list goes on! Here are just a few more examples of many great projects out there:
Built a chair from thousands of butts
Open Dinosaur Project
Antimatter experiment seeks help from the crowd
Identification of new pulsars
Crowd-sourced data coding for the social sciences
Business software reviews
Crowd Capital generation