Open Source and Crowdsourced Models in Pharmaceutical Development

The development of new pharmaceutical products is expensive and becoming more so. Recently published studies estimate that each new prescription drug approval costs an average of $2.5-$5 billion, including the development costs of the successful compound and the associated failures along the way.[1] In recent years, pharmaceutical companies have been looking to creative models to curb the drastic increase in development costs. Among other solutions, they are experimenting with introducing open source and crowdsourced aspects to various parts of the drug development cycle.

The term “open source” comes from the software industry where a piece of software is considered “open source” if the source code for the software is freely available to the public to use and modify. For example, the Linux® operating system platform is free to download for public use, and users create modifications to improve the platform that are then made available to the broader Linux® community. While the definition of “open source” outside of the software industry is less defined, typically an “open source” research project in the medical field is one where the protocols for the research and/or the data resulting from the research are made freely available to the public for use in further research and development instead of being kept confidential.

A project is considered “crowdsourced” if the research is performed by individuals or teams, typically unrelated to the requestor, who are not hired in a typical fashion to perform the work. Crowdsourced research may occur in a competition setting or in a setting of open collaboration among otherwise unconnected individuals. One of the most widely known examples of crowdsourcing is Wikipedia, a website where members of the general public contribute their knowledge about every topic under the sun by writing pages for and editing an online encyclopedia. Not surprisingly, among its plethora of information, Wikipedia has a page dedicated to a constantly evolving list of crowdsourcing projects.

Many parts of the drug development process can benefit from an open source and/or crowdsourced model. Pharmaceutical and other medical research companies are trying many of them, including obtaining samples of biological materials, evaluating how biological structures are likely to be oriented, developing algorithms for identification of likely successful compounds, creating clinical trial protocols, and developing treatments for challenging medical conditions. This article explains a little about each of those ongoing activities and explores the likely intellectual property implications of them.


At least one organization has developed a unique way to obtain free genetic material for its research: the American Gut Project. The American Gut Project, along with similar projects analyzing samples from individuals around the world, is researching a hot button topic in the medical arena: the microbiome of the human gut and its relationship to disease. It bills itself on its website as “one of the largest crowdsourced, citizen science projects in the country.” People pay $99.00 for a kit to collect samples of microbes from their skin, mouth, and fecal matter and then send them to the American Gut Project’s lab. The lab analyzes the microbial content of the samples as well as information from the donors about key variables that could influence the makeup of their microbiomes, like diet, exercise habits, and geographical location. It then adds the information to its collected research, which is de-identified and made publicly available, and returns a personal sequencing and analysis to each donor with individualized information about the donor’s own microbiome and how it may be affected by personal experiences. In this way, the project receives for free the biological material that it needs while having the public pay for the donation and the analysis under the premise that a service is actually being performed for the donors. The project uses aspects of crowdsourcing by obtaining the samples from the general public and is also an open source project. The intellectual property rights in the combined de-identified results are granted freely to the public for general interest and to researchers for analysis and use in further research.


For its crowdsourced research, the University of Washington has created a gaming platform, FoldIt, where users compete to fold proteins in new and creative ways. The Allen Institute for Brain Science and the Center for Game Science at the University of Washington have created a similar platform, Mozak, where users play games to build models of brain cells. All viruses, bacteria, and cancers have proteins involved in their occurrence. The way that a protein is folded predicts what compounds are likely to interact with it, so knowing how those proteins are folded can help researchers create potential treatments for various conditions.

In the FoldIt game, users are trained through a series of easy games to learn the rules of how proteins tend to fold. Next, users advance to trying their hand at folding more complex ones. Points are awarded based on how well a user’s protein fold conforms to the rules of folding, and winners are declared. One might ask though, if there are set rules to how proteins are folded, why have humans fold them rather than running the proteins through computer algorithms? The answer to that question sits in the level of complexity of proteins and their degrees of movement. Having a computer attempt all possible folds for a single protein would take a very long time due to the number of potential permutations of the fold. That being the case, though, some folds are more obvious than others, and humans have an innate analytical ability to rule out all the folds that would be clearly incorrect much better than a computer can. The FoldIt game capitalizes on this human intuition. The game includes basic programs to assist in working out minor kinks, while the human makes the major decisions about folding.
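The scale of the brute-force problem described above can be illustrated with a rough calculation. The sketch below is not drawn from FoldIt itself; it assumes, purely for illustration, that each amino acid in a chain can independently take one of three orientations and that a hypothetical computer can evaluate one trillion candidate folds per second. Both numbers are assumptions chosen to show how quickly the count of possible folds grows.

```python
# Illustrative sketch only. The number of states per residue and the
# evaluation rate are assumptions, not figures from FoldIt or the article.

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def conformations(residues: int, states_per_residue: int = 3) -> int:
    """Count candidate folds if every residue independently picks one state."""
    return states_per_residue ** residues


def years_to_enumerate(
    residues: int,
    states_per_residue: int = 3,
    folds_per_second: float = 1e12,
) -> float:
    """Estimate the years needed to try every candidate fold once."""
    seconds = conformations(residues, states_per_residue) / folds_per_second
    return seconds / SECONDS_PER_YEAR


if __name__ == "__main__":
    for n in (10, 50, 100):
        print(f"{n} residues: {conformations(n):.3e} folds, "
              f"{years_to_enumerate(n):.3e} years")
```

Under these assumptions, a chain of only 100 amino acids already yields more candidate folds than could be checked in many times the age of the universe, which is why a shortcut such as human intuition for ruling out obviously wrong folds is valuable.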

FoldIt is open source in the sense that the results of all folds that are competition winners are made publicly available and free for viewing. Any researcher can use the results of the protein folds in their own research to attempt to find compounds that will interact with the proteins that are folded as discovered in the game.

Mozak has a similar concept to FoldIt, but deals with neuron mapping of brain cells rather than mapping of proteins. It capitalizes on the human eye’s ability to trace a neuron’s structures in three dimensions—a task which is exceedingly difficult for a computer. In future versions of the game, Mozak intends to have humans assist with classifying various neurons based on their structure to help predict the likely function of various neurons. Mozak is still in the early stages of development, but it appears that it will be similar to FoldIt in its implementation. It is currently unclear whether results of Mozak’s research will be made available to the public or whether they will be privately used by the Allen Institute for Brain Science in its own neuroscience research.


Open source platforms and research are great for the general advancement of science, yet they are not always the most practical way for commercial entities to make money from their results. In that arena, platforms like Topcoder, Kaggle, and InnoCentive allow commercial entities to launch competitions to specific communities, such as the coding community, where monetary prizes motivate the competitors and the sponsors retain intellectual property rights in the submissions. These platforms are not limited to use by pharmaceutical companies, but competitions on Kaggle have proven popular in that context, including:

  • a Genentech competition to advance Cervical Cancer Screening that attracted 40 teams and granted $100,000;
  • a Genentech competition to predict when, where, and how strong the flu will be that attracted 50 teams and granted $125,000;
  • a Pfizer private, invitation-only competition for prescription volume prediction that attracted 12 teams for an undisclosed prize amount;
  • a Merck competition to predict molecular activity that attracted 236 teams and granted $40,000;
  • a Boehringer Ingelheim competition to predict biological responses to molecules from their chemical properties that attracted 699 teams and granted $20,000.

Compared with a standard paid research arrangement, paying $20,000 to have 699 teams attempt to develop an algorithm that will reduce a pharmaceutical company’s time and expense in molecule selection is an incredible deal. InnoCentive is similar to Kaggle and has 150 past and present competitions across the array of global health, as well as a separate section devoted solely to challenges run by AstraZeneca. Some of the InnoCentive challenges result in exclusive licenses or other rights being granted to the sponsoring companies, while others look for more general ideas than the Kaggle competitions and have the potential for the future negotiation of licenses, engagement for future research, or even employment for the winners.
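The economics of the Kaggle competitions listed above can be made concrete by dividing each prize by the number of teams it attracted. The short sketch below uses only the team counts and prize amounts stated in this article; the Pfizer competition is omitted because its prize was undisclosed, and the short labels are shorthand for this illustration rather than official competition titles.

```python
# Prize money per participating team, using the figures quoted in the article.
# Labels are informal shorthand; the Pfizer competition is excluded because
# its prize amount was not disclosed.

COMPETITIONS = [
    ("Genentech cervical cancer screening", 40, 100_000),
    ("Genentech flu prediction", 50, 125_000),
    ("Merck molecular activity", 236, 40_000),
    ("Boehringer Ingelheim biological response", 699, 20_000),
]


def prize_per_team(teams: int, prize_usd: float) -> float:
    """Return the prize money divided evenly across all competing teams."""
    return prize_usd / teams


if __name__ == "__main__":
    for name, teams, prize in COMPETITIONS:
        print(f"{name}: ${prize_per_team(teams, prize):,.2f} per team")
```

On these figures, the Boehringer Ingelheim sponsor paid under $30 in prize money for each team that worked on its problem. This is only a rough measure, since it ignores platform fees and the sponsor's internal costs, which the article does not report.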


Transparency Life Sciences (or “TLS”) is an all-digital clinical development services company seeking to increase efficiency and patient relevance in clinical trials. TLS uses crowdsourcing and mobile health technology (i) to create clinical trial protocols for client compounds using a proprietary, web-based software module called “Protocol Builder,” and (ii) to conduct cost-reduced clinical trials using telemonitoring technologies that minimize the need for patient site visits and deliver more informative and relevant data than traditional trials. TLS uses crowdsourcing and open source models for protocol development and patient recruitment, and it digitizes, within reason, every part of a clinical trial to reduce drug development costs while bolstering clinical trial quality and data. The company seeks crowd input from patients, doctors, and researchers to review, modify, and affirm the draft study parameters that the TLS team and its partners have formulated. Contributors are rewarded for participation and for selection of their ideas via elevation to leadership roles within the community, along with the potential opportunity to co-author scientific papers based on findings of the studies. Patients are motivated to contribute by the desire to make a difference for others dealing with their ailments, while researchers and prescribers with relevant ideas and opinions are given the opportunity to be heard in their fields, even if they are not the key opinion leaders biopharma sponsors typically consult. TLS is focused on transparency and open source access to its data and clinical results consistent with the needs and preferences of its clients. Ideally, the results of many of its trials will be available to the wider community to analyze, interpret, and use in research.
Among other projects, TLS has worked in collaboration with Genentech to conduct a pilot study of inflammatory bowel disease patients and with Auven Therapeutics to develop the compound Kiacta for the rare medical condition, pulmonary sarcoidosis.


The medical community at large, sponsored generally by public health entities and organizations like Doctors Without Borders, has created a number of open source programs for the research and development of treatments for critical diseases with an urgent need for better treatments. A group formed in 2014, Open Source Pharma Foundation, has been pushing to create the “Linux for Drugs” model for development of affordable pharmaceuticals for poorly served conditions. Their website is a wealth of information, and their collaborative foundation represents input from across the life sciences spectrum from Doctors Without Borders to governmental institutions to commercial pharmaceutical companies.

There are also programs aimed at finding cures for specifically identified poorly served conditions. One good example of such a program is Open Source Malaria. The Open Source Malaria program uses a distributed collaborative research model with an open “to do list” with details of all aspects of research the program needs. Tasks range from the very simple to the much more complex, and contributors have ranged from researchers to primary school classes. All research results are made public in the spirit of open source innovation, and researchers are unpaid, instead participating in research for the greater good. It will be interesting to see in the future what results come of such open source development without the traditional commercial drivers motivating development.


Crowdsourced and open source models of research come in so many variations that there is no simple explanation for what they mean in a legal context. Open source models can have all research results made public with the requirement that any derivative works also be made public. Even so, there are possibilities for commercial entities to pick up some of the open source data to create proprietary products from which they will profit. Crowdsourced models similarly can be either competitions, where results are owned by the sponsors and research is paid for via prize money, or other models, where the sponsors only gain non-exclusive licenses or rights to negotiate licenses to the results and all potential solutions are publicly shared.


The use of open source and crowdsourced models in the development of pharmaceutical and other life sciences products is very new. What drug development may look like in the not-too-distant future will likely largely depend on the success of these early projects. Still-open questions include how motivated researchers will stay when they are working for uncertain prize money or the greater good of humanity rather than a set wage, and how companies will make it financially worthwhile to pick up products that may have incredible promise but open data and no mechanism for launching in a brand form, the way pharmaceutical companies traditionally recoup their development expenses. There are many ways that the future could unfold in this space, and it will be very interesting to see the course that pharmaceutical research takes through these new waters.

[1] See, e.g., Matthew Herper, The Cost of Creating a New Drug Now $5 Billion, Pushing Big Pharma to Change, Forbes (Aug. 11, 2013, 11:10 AM); Joseph A. DiMasi, Ph.D., Director of Economic Analysis at Tufts Center for the Study of Drug Development, Innovation in the Pharmaceutical Industry: New Estimate of R&D Costs, Address Before Tufts Center for the Study of Drug Development (Nov. 18, 2015).