Public database aids drug researchers

May 30, 2006

Public database to enhance discovery of medicines

Researchers at the Broad Institute of Harvard and MIT have released ChemBank 2.0, a major upgrade to ChemBank, a publicly available database poised to enhance scientists' capabilities in drug discovery.

The web-based ChemBank includes data on drug candidates (called small molecules) and their behavior in cells selected to serve as models of human disease, especially cancer. Using ChemBank's analysis tools, investigators can query and analyze these freely available data and may even export the raw information to perform their own unique analyses. By these mechanisms, ChemBank enables researchers to gain new knowledge of human disease and to identify starting drug candidates for novel therapeutics.

"By connecting many aspects of biology and medicine with many drug candidates, ChemBank helps the drug-hunting community become more than the sum of its parts," said Stuart Schreiber, Director of the Initiative for Chemical Genetics, the platform behind ChemBank's release.

ChemBank currently reflects the participation of over 200 biomedical researchers, chemists and computational scientists nationwide who have agreed to work in an open data-sharing environment. It contains information on over 700,000 small molecules and 16 million measurements made on cells treated with small molecules.

ChemBank's latest release is the newest development in an innovative approach to accelerating drug discovery, launched by the National Cancer Institute (NCI) in 2002. The NCI's Initiative for Chemical Genetics (ICG) supports the synthesis of small molecules and the screening of these compounds for their effects on specific biological activities, especially those related to cancer. Central to the ICG's efforts is making this information publicly accessible, which further enables the larger drug-discovery community. Participation in the ICG's current open data-sharing environment involves signing a data-sharing agreement (DSA), which ensures that data gathered at the ICG will be available for one year to all signatories of the DSA (held in "ChemBank-DSA"). After that period, all data are released to the public in ChemBank.

"ChemBank is essentially a matrix linking many different small molecules with many states of cells – from healthy to diseased," said Schreiber. Small molecules are best known as medicines, like aspirin, but they also play vital roles in all living organisms. They include sex hormones and neurotransmitters. They function by attaching to proteins, the workhorses of all living cells, and can inhibit or promote the actions of those proteins in the cell, making them especially important targets of cancer research.

The Broad Institute's Chemical Biology Program, which houses the NCI's ICG, continues to be a leader in the synthesis and screening of small molecules, and in determining the effects these drug candidates have on specific disease-relevant biological activities.

ChemBank 2.0 incorporates many new search and data-mining capabilities and a standardized interface that will facilitate connections to the NCI's informatics grid, caBIG. In addition, the new ChemBank infrastructure is designed to expand with the sophistication of these techniques as well as to reflect the growing interconnectivity of the medicinal research community. Planned enhancements include the addition of microscopy, RNAi screening, and proteomic data.

"The whole thrust of ChemBank in the near term will be to reach out to the scientific community and to provide an information hub for drug-discovery research," said Paul Ferraiolo, Head of ICG's Software Engineering team.

ChemBank may be accessed on the web. This site replaces the earlier version, which had a strong component of highly descriptive biological activities for small molecules, obtained by mining the medicinal research literature, but which otherwise offered limited interactive functionality.

ChemBank was established and funded entirely with Federal Funds from the National Cancer Institute, National Institutes of Health, under Contract No. N01-CO-12400.

About the Broad Institute of Harvard and MIT

The Broad Institute of MIT and Harvard was founded in 2003 to bring the power of genomics to biomedicine. It pursues this mission by empowering creative scientists to construct new and robust tools for genomic medicine, to make them accessible to the global scientific community, and to apply them to the understanding and treatment of disease.

The Institute is a research collaboration that involves faculty, professional staff and students from throughout the MIT and Harvard academic and medical communities. It is jointly governed by the two universities.

Organized around Scientific Programs and Scientific Platforms, the unique structure of the Broad Institute enables scientists to collaborate on transformative projects across many scientific and medical disciplines.

For further information about the Broad Institute, go to


Last updated: November 01, 2018