HCMI has released 148 patient-derived next-generation cancer models from 18 tumor types! Masked somatic MAF data are now available for a subset of models. Users can now search and filter these models by available somatic variants on the HCMI Searchable Catalog.
Accessing HCMI Data
The Human Cancer Models Initiative (HCMI) is a community resource of next-generation cancer models derived from parent tumors that span a range of cancer subtypes. The models also include those derived from individuals of diverse ethnic and racial backgrounds as well as from rare adult and pediatric cancers. The models and their associated normal and parent tumor tissues are annotated with clinical, genomic and molecular data.
The user guide explains how to access the HCMI data.
A model's case-associated data include data from the derived model, the originating tumor tissue and the normal tissue. The clinical, genomic and molecular data from HCMI cancer models, matched normal, and tumor tissues are quality-controlled at each step of the cancer model development pipeline. The quality-controlled and harmonized data are available at NCI’s Genomic Data Commons (GDC).
HCMI data are available to the scientific community in two tiers, open access and controlled access, as defined by the NIH data sharing policy. The HCMI follows the NIH’s human subjects protection and data access policies to ensure the privacy and confidentiality of the research participants. Both types of data can be accessed through the GDC.
Open-access data present minimal risk that a participant can be identified. HCMI provides the scientific community the maximum amount of open-access data allowable under HIPAA guidelines. Access to these data does not require data use certification.
Examples of open-access data are:
- De-identified clinical information
- Biospecimen data including tissue pathology
- Tumor- and model-associated somatic mutations
- Gene expression data
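As an illustration, open-access file metadata of this kind could be listed programmatically. The sketch below assumes the public GDC REST API at `https://api.gdc.cancer.gov/files`, its JSON filter syntax, and the project identifier `HCMI-CMDC`; the function names are hypothetical and are not part of any HCMI or GDC tooling.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the public GDC API "files" endpoint.
GDC_FILES_ENDPOINT = "https://api.gdc.cancer.gov/files"


def build_hcmi_filter(access="open", project_id="HCMI-CMDC"):
    """Build a GDC-style filter selecting files by project and access tier."""
    return {
        "op": "and",
        "content": [
            {"op": "in",
             "content": {"field": "cases.project.project_id", "value": [project_id]}},
            {"op": "in",
             "content": {"field": "access", "value": [access]}},
        ],
    }


def build_query_url(access="open", size=10):
    """Encode the filter and requested fields into a GET URL."""
    params = {
        "filters": json.dumps(build_hcmi_filter(access)),
        "fields": "file_id,file_name,data_type,access",
        "format": "JSON",
        "size": str(size),
    }
    return GDC_FILES_ENDPOINT + "?" + urllib.parse.urlencode(params)


def fetch_file_list(access="open", size=10):
    """Send the query (requires network access) and return the file records."""
    with urllib.request.urlopen(build_query_url(access, size)) as resp:
        return json.load(resp)["data"]["hits"]
```

Calling `fetch_file_list()` would return up to ten open-access file records. Passing `access="controlled"` lists controlled-access file metadata in the same way, but downloading those files still requires dbGaP authorization.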
Controlled-access data are stripped of direct participant identifiers as defined by HIPAA, but they contain genomic information that could identify the patient.
Examples of controlled-access data are:
- Raw sequencing data for WGS, WXS or RNA-Seq
- Harmonized datasets which contain germline variants
Access to these data requires user certification, which can be obtained through NCBI’s dbGaP (National Center for Biotechnology Information’s database of Genotypes and Phenotypes). Researchers may apply for dbGaP access by filling out a Data Access Request form. Read “How to Access Controlled Data” below for more information.
Obtain Data Use Certification (DUC) through dbGaP
- To access controlled-access HCMI data, all investigators must submit a Data Access Request (DAR) through dbGaP. Visit the general overview on accessing genomic data and the video tutorial for guidance.
Note: NCI intramural investigators must submit a dbGaP account activation request before submitting a DAR. Contact the NCI Office of Data Sharing for instructions.
- All investigators must have an NIH eRA Commons account or HHS credentials (intramural investigators only).
- Approved users may access controlled HCMI data through the HCMI-CMDC page at the GDC. For more information, visit the GDC controlled data access process page.
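Once access is approved, a controlled-access file could be retrieved with an authenticated request. The sketch below assumes the GDC API data endpoint `https://api.gdc.cancer.gov/data/<file UUID>` and an authentication token obtained from the GDC after login, sent in the `X-Auth-Token` header. The function names and the file identifier in the usage note are hypothetical placeholders.

```python
import urllib.request

# Assumption: the public GDC API "data" endpoint for file downloads.
GDC_DATA_ENDPOINT = "https://api.gdc.cancer.gov/data"


def build_download_request(file_id, token=None):
    """Build a download request; a token is only needed for controlled-access files."""
    headers = {}
    if token:
        # The token authenticates the approved user; never commit it to source control.
        headers["X-Auth-Token"] = token.strip()
    return urllib.request.Request(f"{GDC_DATA_ENDPOINT}/{file_id}", headers=headers)


def download_file(file_id, dest_path, token=None):
    """Stream the file to disk in 1 MiB chunks (requires network access)."""
    request = build_download_request(file_id, token)
    with urllib.request.urlopen(request) as resp, open(dest_path, "wb") as out:
        while chunk := resp.read(1 << 20):
            out.write(chunk)
```

For example, `download_file("<file UUID>", "sample.bam", token=open("gdc-token.txt").read())` would save one file locally, where both the UUID and the token file name are stand-ins. For large or numerous files, a dedicated transfer tool is likely a better fit than a hand-written loop.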
Get Help If You Have Trouble Accessing Data