Issue 21: March 2019

HCMI Program Highlights
Human Cancer Models Initiative’s Searchable Catalog of Cancer Models

HCMI is developing next-generation cancer models from several cancer types, including rare cancers and cancers from racial and ethnic minority populations. This article introduces an interactive, searchable online catalog of HCMI models where users can query, view, and download model-associated data.

CTD² Guest Editorial
From Variants to Functions - New Strategies for the Interpretation of Cancer Genomes

A key bottleneck in precision oncology is the lack of knowledge of the function of most cancer variants. This article discusses cellular profiling methods developed by researchers at the Dana-Farber Cancer Institute to assess the functional impact of variants. 

CGCI Program Highlights
Epstein-Barr Virus (EBV) Status Identifies Distinct Burkitt Lymphoma (BL) Phenotype in Pediatric Endemic and Sporadic BL

Burkitt Lymphoma (BL) is an aggressive B-cell non-Hodgkin lymphoma with endemic (eBL) and sporadic (sBL) subtypes. eBL is usually associated with the presence of Epstein-Barr virus (EBV). This article focuses on the genetic and molecular distinctions between EBV-positive and EBV-negative BL and EBV's role in tumorigenesis.

CTD² Guest Editorial
Systems Cancer Immunology for the Masses

Cytometry by time-of-flight (CyTOF®), or mass cytometry, is a cost-efficient, antibody-based method to measure the phenotypes of single cells. The workflow for data analysis and applications of the technique in cancer immunology are described in this article.

OCG Perspective
Promoting Scientific Initiatives through Science Communication

Cindy Kyi is a new Health Communications Fellow at the Office of Cancer Genomics (OCG). In this article, Cindy shares her background, aspirations, and responsibilities as a communications fellow at the OCG. 

HCMI Program Highlights
Human Cancer Models Initiative’s Searchable Catalog of Cancer Models

Cindy Kyi, Ph.D., Eva Tonsing-Carter, Ph.D., and Lauren Hurd, Ph.D.
Office of Cancer Genomics, NCI

HCMI Background

The National Cancer Institute’s (NCI) director, Dr. Ned Sharpless, has committed to focusing on “big data” efforts and building databases as common resources for researchers to accelerate the translation of laboratory findings into the clinic. In keeping with this focus, a goal of the Office of Cancer Genomics (OCG) is to facilitate the development and sharing of resources such as data, tools, and protocols within the research community. The Human Cancer Models Initiative (HCMI), one of OCG’s programs, is an international collaboration founded by the NCI, Cancer Research UK (CRUK), Wellcome Sanger Institute (WSI), and the foundation Hubrecht Organoid Technology (HUB). The ultimate goal of HCMI is to provide a community resource of patient-derived next-generation cancer models (e.g., 3D organoids and 2D conditionally reprogrammed cells) with associated molecular sequencing and clinical data. The motivation behind HCMI’s next-gen cancer models is discussed in a previous e-News article. The models, as well as protocols and model-associated data, are available to the research community for use in basic and translational research.

NCI-funded cancer model development centers (CMDCs) and HCMI consortium members are creating about 1,000 models from diverse tumor types, including rare cancers, pediatric cancers, and cancers from racially and ethnically diverse populations. HCMI models are annotated with the patients’ clinical data and, for most cases, with the genomes and transcriptomes of the derived model, the associated parent tumor, and matched normal tissue. Molecular characterization data and associated clinical data for the models from the NCI-supported CMDCs will be available to researchers through NCI’s Genomic Data Commons (GDC). The HCMI models will be available to researchers through the third-party distributor, American Type Culture Collection (ATCC).

For details of the NCI cancer model development pipeline, please see the NCI Cancer Model Development page. Once models complete the HCMI cancer model development pipeline, the validated models, clinical and molecular data, and tools become available as a community resource.

As an ongoing community resource, HCMI is developing an online interactive Searchable Catalog for potential users to view the available models and associated information in a centralized place. In this catalog, users can query, select, view, and download available information regarding the HCMI models.

HCMI Searchable Model Catalog

The HCMI Searchable Catalog is the result of a joint effort among all HCMI members. It contains data elements extracted from the quality-checked clinical and molecular characterization data for each tumor and derived model. The data elements were developed and compiled with input from clinical collaborators and approved by the HCMI steering committee members. They are defined by controlled vocabulary through collaboration with staff from NCI’s Cancer Data Standards Registry and Repository (caDSR). Additional data elements will be added in the future, and feedback from users will be taken into consideration.

Browsing the Searchable Catalog Landing Page

Figure 1: Snapshot of the HCMI Searchable Catalog landing page (numbers on the image correspond to the text below for explanation and are not part of the actual web page).

The catalog allows users to filter and identify models according to data elements of interest. The filter panel (1) on the catalog landing page (Figure 1) displays the searchable data elements, which include characteristics of the models, such as the primary site of the tumor, type of model, and tissue acquisition site, as well as the clinical diagnosis and background of the patients from whom the models were derived. The models that meet the characteristics chosen in the filter panel are displayed in the main display pane (2) on the landing page. Under “COLUMNS”, users can choose the data elements of interest to be displayed within the main display pane. The catalog also allows users to export the model information displayed in the main display pane by using the “EXPORT ALL” function. Users may also save models to “My Model List” by checking the box next to the model name. “My Model List” models and associated data can be downloaded by clicking the “Download TSV” tab within “My Model List”.
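Once a model list has been downloaded as a TSV file, the same kind of filtering can be repeated offline. The following is a minimal Python sketch under stated assumptions: the column names (`name`, `primary_site`, `model_type`) and the model names are hypothetical placeholders, not the catalog's actual export headers or real HCMI models, and should be replaced with whatever the downloaded file contains.

```python
import csv
import io

# Hypothetical export: the column names and rows below are illustrative
# assumptions, not the catalog's actual headers or real models.
SAMPLE_TSV = (
    "name\tprimary_site\tmodel_type\n"
    "MODEL-A\tColon\t3D organoid\n"
    "MODEL-B\tPancreas\t3D organoid\n"
    "MODEL-C\tColon\t2D conditionally reprogrammed cells\n"
)


def filter_models(tsv_text, column, value):
    """Return the rows whose `column` equals `value`, ignoring case."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    wanted = value.strip().lower()
    return [row for row in reader if row[column].strip().lower() == wanted]


if __name__ == "__main__":
    for row in filter_models(SAMPLE_TSV, "primary_site", "colon"):
        print(row["name"], row["model_type"])
```

To use it on a real download, read the file's text with `open(path, encoding="utf-8").read()` and pass it in place of `SAMPLE_TSV`.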

Browsing Individual Model Pages

Users may view detailed information on individual models by clicking on a specific model name under the “NAME” column in the main display pane (Figure 1). Clicking on an individual model name will take users to the model page (Figure 2) with more detailed information about that model. 

Figure 2: Snapshot of an individual model page within the HCMI Searchable Catalog

Under “Model Details”, users can find details of the model, such as growth rate, split ratio, and the type of model, as well as tumor-related information, such as therapeutic regimen, primary site of cancer, and stage of cancer (TNM stage). In addition, “Patient Details” includes the demographics and clinical background of the patient from whom the tissue sample for the model was acquired. Furthermore, under “Repository Status”, users can find information about licensing requirements for commercial use, availability of the model at the third-party distributor, and links to external resources including NCI’s GDC website. At the GDC, users can find available associated clinical and sequencing data of the parent tumor, matched normal tissue, and the model. Users who are interested in purchasing HCMI models for use in research may also follow the link to the third-party distributor, ATCC.

Users may view the images of growing cells from a specific model under “Model Images” at different magnifications as available. The cell growth pattern on images will vary according to the model type (e.g. 2D or 3D).

The “Variants” section on the model page includes “clinical sequencing”, which shows any collected clinically derived sequence variants; “histopathological biomarkers”, which lists clinical biomarkers of the cancer along with information pertaining to each biomarker; and “genomic sequencing” of the models and associated tissues, as available. Users may also enter keywords in the “Filter” box to filter for specific variants (e.g., MSH6 or TP53). Variant data for the model can be downloaded by clicking on the “download TSV” icon.

Users may provide feedback or report bugs to OCG by email. For more detailed Searchable Catalog information, users may see the HCMI Searchable Catalog User Guide.

HCMI’s Continuous Efforts Towards Precision Oncology

HCMI is generating models of breast, colorectal, esophageal, liver, and pancreatic cancers and of glioblastoma. Models for rare pediatric cancers such as neuroblastoma, Wilms tumor, and Ewing sarcoma are in progress. HCMI also plans to develop models from other cancer types, including ovarian, head and neck, kidney, and bladder cancers, as well as cancers from racial and ethnic minority populations. The model list will be updated as the models become available.

The aspiration for HCMI is to provide a resource of novel next-generation cancer models that are characterized with clinical and molecular data to the research community. HCMI models and resources such as protocols used for model development and SOPs may provide the research community with the tools to study the molecular pathways that influence tumor development in these cancers and contribute to precision oncology.   

CTD² Guest Editorial
From Variants to Functions - New Strategies for the Interpretation of Cancer Genomes

JT Neal, Ph.D., Jesse S. Boehm, Ph.D., and William C. Hahn, M.D., Ph.D.
Dana-Farber Cancer Institute

The large-scale sequencing of cancer genomes has identified hundreds of thousands of genetic variants present in human tumors. This variation is often composed of a small number of “driver” mutations, which contribute to tumor development and progression, buried amongst a large number of “passenger” mutations that confer little or no selective advantage to tumor cells. The function of most cancer variants remains unknown, and the ability to identify and link causal variants to disease biology and available drugs remains a key bottleneck to unlocking the potential of precision medicine in cancer. A major goal of the Dana-Farber Cancer Target Discovery and Development (CTD²) Center is the functional characterization of such variants through the development of key technologies and pipelines that will accelerate the translation of thousands of additional cancer variants into disease mechanisms and actionable therapeutic hypotheses.

A significant hurdle to accomplishing this goal is the lack of scalable assays to distinguish passenger mutations from driver mutations across different genes and cell types, a process that now often requires the development of bespoke assays for every gene (and sometimes each cell type as well) that one wishes to study. This often means that a graduate student or postdoc must spend years of their training on assay development rather than on studying disease biology. In collaboration with investigators at the Broad Institute, our CTD² center is developing new cellular profiling methods that use changes in a cell’s gene expression profile to assess whether a variant is impactful, an approach that overcomes such bottlenecks by enabling an investigator to test variants in multiple genes from multiple cancer types in a single experiment.

In two recent studies¹,², we demonstrated the effectiveness of this type of approach. In the first, we looked across cancer types and tested 474 mutant alleles curated from 5,338 tumors in pooled tumor formation assays and arrayed expression profiling. Using these methods, we were able to identify 12 transforming alleles, including two in genes (PIK3CB, POT1) that had not previously been shown to be tumorigenic. Additionally, several alleles that were found only once in 5,338 sequenced tumors still exhibited potent activity, demonstrating the importance of functional assays in determining variant impact. In the second study, we focused on lung adenocarcinoma². Here, we used similar approaches to characterize 194 somatic mutations, including rare, clinically actionable somatic variants in EGFR, ARAF, ERBB2, and BRAF. We have also shown that transcriptional signatures can be used to quantitatively stratify allele impact into gain, loss, or change of function categories, providing additional resolution. Lastly, we demonstrated that transcriptional signatures can be used to predict more complex phenotypes such as drug resistance and tumorigenesis. These studies demonstrated the value of combining functional assessment with genomic characterization for the identification of rare tumorigenic variants and serve as an important test case for the use of transcriptional signatures as a proxy for more laborious gene-specific cell-based assays.
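The idea of stratifying alleles by comparing a variant's transcriptional signature with the wild-type signature can be illustrated with a toy Python sketch. This is not the method used in the cited studies: the function names, the cutoffs (`corr_cutoff`, `strength_ratio`), and the decision rules are assumptions chosen only to make the gain, loss, and change of function categories concrete. Each signature is assumed to be a list of per-gene expression changes relative to a control.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def magnitude(x):
    """Euclidean length of a signature, used as a crude measure of its strength."""
    return math.sqrt(sum(a * a for a in x))


def classify_variant(wt_sig, var_sig, corr_cutoff=0.5, strength_ratio=1.5):
    """Toy rule set (hypothetical thresholds) for labeling a variant allele."""
    wt_strength = magnitude(wt_sig)
    if wt_strength == 0:
        raise ValueError("wild-type signature must be non-zero")
    ratio = magnitude(var_sig) / wt_strength
    if ratio < 1 / strength_ratio:
        return "loss of function"      # signature much weaker than wild type
    if pearson(wt_sig, var_sig) < corr_cutoff:
        return "change of function"    # comparable strength, different pattern
    if ratio > strength_ratio:
        return "gain of function"      # same pattern, much stronger
    return "wild-type-like"


if __name__ == "__main__":
    wild_type = [2.0, -1.0, 0.5, 1.5]
    print(classify_variant(wild_type, [4.0, -2.0, 1.0, 3.0]))
```

A real analysis would compare replicate-to-replicate variability and apply statistical testing rather than fixed cutoffs; the sketch only shows how signature similarity and signature strength can separate the categories.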

We have now extended this work to develop new methods that use droplet-based single-cell RNA sequencing (Figure) instead of bulk RNA sequencing to assess variant impact in pooled format. These new methods will potentially enable massive increases in scale and resolution in cellular screens and allow the simultaneous assessment of large perturbation libraries across numerous cellular contexts in a single experiment. As a proof-of-concept, we have validated impactful mutations in the tumor suppressor gene TP53 and the oncogene KRAS using only single-cell transcriptional signatures as a readout, recapitulating phenotypic data that we had previously generated³ using more traditional gene-specific screening approaches. Our long-term goal is for these methods to ultimately enable massively multiplexed single-cell transcriptional profiling for the characterization of cancer variants, for high-dimensional readout of CRISPR knockout screens for genetic vulnerabilities, and for the broad characterization of common and rare genetic variation in human health and disease.

Figure: UMAP plot of single-cell RNA sequencing data from over 300,000 A549 cells expressing a library of 100 P53 variants seen in humans. (Image credit: Ursu O. (2019) Broad Institute)

Moving forward, we are exploring whether other signature-based approaches, such as image-based optical profiling, can be used to complement our transcriptional profiling studies. Cellular morphology is a rich source of phenotypic information that can be used to interpret a wide range of normal and perturbed states, and recent advances in computational image analysis have dramatically increased the number of cellular morphological features that can be extracted in a single assay. We have recently demonstrated that these features can be used to classify cancer alleles by function at the single-cell level, using only fluorescent images as a data source⁴, at a small fraction of the cost of single-cell RNA sequencing. We are currently piloting approaches to compare image-based optical signatures to transcriptional signatures across a range of cellular perturbations to determine the degree to which these methods provide complementary and/or overlapping information about cellular biology.

In parallel, we are developing new methods to generate cancer variants de novo, using CRISPR-based strategies as an alternative to cDNA overexpression. In particular, we are utilizing CRISPR base editing, which enables the high-efficiency generation of DNA variants with single base pair resolution, without double-strand DNA cleavage⁵,⁶. These editors will enable us to engineer variants in an even larger fraction of the genome, including those in large genes and non-coding regions, which can be difficult to study using overexpression-based approaches. We aim to adapt these editors for use in a wide variety of cell types, including primary cell lines and organoids derived from patient tumors, in order to enable modeling of rare variants and cancer types that are not represented in traditional cell line collections.
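As a small illustration of which single-base variants the two base editor classes described in the cited papers could in principle install, the Python sketch below maps a reference and alternate base to an editor class. It is a simplification under stated assumptions: the function name and lookup table are hypothetical, and the sketch ignores practical constraints such as the editing window, the availability of a nearby PAM site, and bystander edits.

```python
# Base changes each editor class can install, written on the strand shown.
# Cytosine base editors convert C to T; when the edited C lies on the
# opposite strand, the change reads as G to A on the strand shown.
# Adenine base editors convert A to G; on the opposite strand this reads
# as T to C.
EDITOR_BY_SUBSTITUTION = {
    ("C", "T"): "cytosine base editor",
    ("G", "A"): "cytosine base editor",
    ("A", "G"): "adenine base editor",
    ("T", "C"): "adenine base editor",
}


def editor_for(ref, alt):
    """Return the editor class for a single-base change, or None if neither fits."""
    return EDITOR_BY_SUBSTITUTION.get((ref.upper(), alt.upper()))


if __name__ == "__main__":
    for ref, alt in [("C", "T"), ("A", "G"), ("C", "G")]:
        print(ref, ">", alt, ":", editor_for(ref, alt))
```

The lookup makes one limitation explicit: only transitions are covered, so a transversion such as C>G returns `None` in this sketch.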

Taken together, these efforts represent a significant step towards the implementation of a first-generation variant-to-function pipeline that will dramatically accelerate the interpretation of genomic variants in cancer and other genetic diseases. Over the next five years, we envision that such a pipeline will enable the generation and functional characterization of tens to hundreds of thousands of cancer variants, moving us closer to our ultimate goals of understanding the functional impact of all genetic variation in cancer, and making precision medicine a reality for all cancer patients.


  1. Kim E, Ilic N, Shrestha Y, et al. Systematic functional interrogation of rare cancer variants identifies oncogenic alleles. Cancer Discovery. 2016 Jul;6(7):714-26. (PMID: 27147599)
  2. Berger AH, Brooks AN, Wu X, et al. High-throughput phenotyping of lung cancer somatic mutations. Cancer Cell. 2017 Dec 11;32(6):884. (PMID: 27478040)
  3. Giacomelli AO, Yang X, Lintner RE, et al. Mutational processes shape the landscape of TP53 mutations in human cancer. Nature Genetics. 2018 Oct;50(10):1381-1387. (PMID: 30224644)
  4. Rohban MH, Singh S, Wu X, et al. Systematic morphological profiling of human gene and allele function via Cell Painting. eLife. 2017 Mar 18;6. (PMID: 28315521)
  5. Komor AC, Kim YB, Packer MS, et al. Programmable editing of a target base in genomic DNA without double-stranded DNA cleavage. Nature. 2016 May 19;533(7603):420-4. (PMID: 27096365)
  6. Gaudelli NM, Komor AC, Rees HA, et al. Programmable base editing of A•T to G•C in genomic DNA without DNA cleavage. Nature. 2017 Nov 23;551(7681):464-471. (PMID: 29160308)

CGCI Program Highlights
Epstein-Barr Virus (EBV) Status Identifies Distinct Burkitt Lymphoma (BL) Phenotype in Pediatric Endemic and Sporadic BL

Nicholas Griner, Ph.D.
Office of Cancer Genomics, NCI

Burkitt Lymphoma (BL) is an aggressive type of B-cell non-Hodgkin lymphoma (NHL) first described in 1958 by Dr. Denis Burkitt in children from sub-Saharan Africa. Different subtypes of BL exist, including endemic BL (eBL) and sporadic BL (sBL), among others. The subtypes vary geographically: eBL is largely associated with children in malaria-prone regions. eBL is also strongly associated with the presence of Epstein-Barr virus (EBV), which is known to contribute to lymphoma development. eBL patients are generally children who present with large tumors of the jaw and abdominal cavity. Chemotherapy cures many of these eBL patients where proper supportive care is available. However, eBL continues to be fatal in countries where access to proper care is limited, diagnosis at advanced stages is common, and poverty and malnutrition are prevalent. sBL is a subtype of BL that occurs largely outside malaria-prone regions. The incidence of sBL is 10-fold lower than that of eBL¹. sBL tumors commonly arise in abdominal and thoracic anatomic regions, while large tumors of the jaw are rare. sBL is more common in adults, in whom treatment can be more challenging despite considerable exploration in clinical trials. Thus, there is a continued need for better understanding of the genetic and molecular features of BL to facilitate the discovery of more effective treatments with lower toxicity.

While many studies have examined different molecular pathways and genes, many are limited by small sample size and by a lack of background data, such as geographic origin and other clinical parameters necessary for a holistic examination of cancer pathogenesis. The characteristic genetic hallmark of BL is constitutive MYC expression caused by a chromosomal translocation that places MYC next to an immunoglobulin (IG) enhancer. However, additional genetic aberrations are necessary for BL to develop. EBV appears to play a strong role in the emergence of BL², though the exact mechanisms are still unknown. In a recent publication in Blood, Grande and Gerhard et al.³ describe a large-scale genomic and transcriptomic study of 106 pediatric BL cases (mostly from the Burkitt Lymphoma Genome Sequencing Project, BLGSP), comparing the molecular characteristics of EBV-positive and EBV-negative eBL and sBL tumors. The results support the requirement of EBV infection in eBL, detail important pathogenic differences, and indicate potential roles of EBV in BL.

In this cohort, tumor EBV status grouped patients into different clinical subtypes irrespective of geographic origin. EBV-positive BL tumors had higher expression of the AICDA gene, which codes for the enzyme activation-induced cytidine deaminase (also known as AID). In addition, the study identified 70 genomic regions enriched for non-coding mutations. Many of these regions were associated with genes affected by aberrant somatic hypermutation (aSHM), the off-target activity of somatic hypermutation, the process by which B cells adapt immunoglobulin production to newly encountered foreign elements. Further examination showed that AICDA expression correlated with the number of mutations in these non-coding regions. These findings suggest that non-coding mutations in discrete genomic regions are most likely a consequence of AICDA-mediated aSHM in EBV-positive tumors.

A number of additional non-coding mutation peaks not known to be targeted by aSHM were also identified, including a peak in the promoter of the PVT1 gene, a known target of MYC. These mutations were associated with the presence of EBV, but not with eBL status. Another peak was identified in a distal enhancer of PAX5, a transcription factor with a role in B-cell differentiation. These results suggest the possibility of AICDA contributing to BL by introducing non-coding mutations in regulatory regions.

Figure: Mutational processes in BL. (A) Mutation frequency is shown for each disease subtype. From top to bottom, the following simple somatic mutations (SSMs) are considered in each tumor: all genome-wide SSMs; SSMs outside non-coding mutation peaks; SSMs within peaks; and non-synonymous SSMs in all protein-coding genes. This analysis was restricted to WGS data from the BLGSP discovery cohort (N = 91). (B) Number of BL-associated genes (BLGs) that are mutated in each BLGSP discovery and validation case (N = 120). All mutation types were considered. Discordant cases are highlighted as red points. The number of mutated BLGs was compared using Mann–Whitney U tests (**, P-value < 0.001). (C) Estimated number of single nucleotide variants is shown per mutational signature for each disease subtype in the BLGSP discovery cohort (N = 91). The four de novo mutational signatures (BL sig.) are annotated with the associated COSMIC reference signature (COSMIC sig.). ICGC cases were excluded to avoid the possible confounding effect of lower sequencing coverage. Significance brackets (panels A and C): *, Q-value < 0.1; **, Q-value < 0.001; ***, Q-value < 0.00001 (Mann–Whitney U test). (Adapted from Grande, Gerhard et al., Blood. 2019. doi: 10.1182/blood-2018-09-871418)

EBV-positive cases had fewer mutated BL-associated genes (genes that are recurrently mutated in BL) per tumor but there was no difference in the number of BL-associated gene mutations when eBL was compared with sBL. Yet, the total mutation load per genome was significantly higher in eBL and EBV-positive cases. This distinction suggests increased accumulation of driver mutations in EBV-negative BL and corroborates the role of EBV infection in promoting tumorigenesis. The study also finds a disparity in the prevalence of mutations affecting apoptotic genes associated with EBV status, but not with geographic origin. These findings are consistent with EBV suppressing apoptosis in BL cells.

To further investigate the underlying mutational processes leading to BL, de novo mutational signatures were generated, yielding four distinct signatures, or patterns of mutations. Comparing these four signatures (A, B, C, and D) with reference COSMIC (Catalogue of Somatic Mutations in Cancer) signatures, the study found that signatures A, B, C, and D correlate, respectively, with aging, unknown etiology, defective DNA mismatch repair (MMR), and AICDA and polymerase η activity (Figure). BL signature C was significantly associated with tumor EBV status but not with clinical variant status, suggesting a link between EBV and DNA mismatch repair. BL signature D was strictly associated with AICDA expression, while neither signature B nor signature C was correlated with AICDA expression. Overall, this suggests that the increased mutational load in EBV-positive cases may be due to defective DNA mismatch repair and increased AICDA activity.
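Matching a de novo signature to a reference catalog is often done by scoring the similarity of the two mutation spectra. The Python sketch below uses cosine similarity, which is a common choice but an assumption here, since the article does not state which metric the study used. The six-channel spectra and the reference names (`REF-1`, `REF-2`) are invented for illustration; real signatures are typically defined over many more mutation categories.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def best_match(signature, references):
    """Return (name, score) of the reference most similar to `signature`."""
    scored = ((name, cosine_similarity(signature, ref)) for name, ref in references.items())
    return max(scored, key=lambda pair: pair[1])


if __name__ == "__main__":
    # Toy spectra over six substitution classes (C>A, C>G, C>T, T>A, T>C, T>G).
    # All values are made up for illustration.
    de_novo = [0.10, 0.05, 0.60, 0.05, 0.15, 0.05]
    references = {
        "REF-1": [0.10, 0.10, 0.60, 0.05, 0.10, 0.05],
        "REF-2": [0.40, 0.10, 0.10, 0.20, 0.10, 0.10],
    }
    name, score = best_match(de_novo, references)
    print(name, round(score, 3))
```

A score close to 1 indicates nearly identical spectra; in practice a cutoff is chosen below which a de novo signature is treated as having no confident reference match.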

The findings from this BLGSP study demonstrate genetic and molecular distinctions between EBV-positive and EBV-negative BL as well as the role that EBV plays in BL tumorigenesis. The findings highlight the potential utility of DNA-damaging chemotherapy in BL patients with disrupted apoptosis or defective DNA mismatch repair. EBV appears to be a vulnerable therapeutic target since EBV-positive tumors are reliant on EBV expression. Additional functional assays, including CRISPR-Cas9 screens and small-molecule combinatorial experiments, will be necessary to further understand the exact mechanisms that distinguish EBV-positive and EBV-negative BL.

Researchers from multiple institutions within the Burkitt Lymphoma Genome Sequencing Project (BLGSP) contributed to this research.


  1. Mbulaiteye SM, Pullarkat ST, Nathwani BN, et al. Epstein-Barr virus patterns in US Burkitt lymphoma tumors from the SEER residual tissue repository during 1979-2009. APMIS. 2014;122(1):5-15. (PMID: 23607450)
  2. Crawford DH. Biology and disease associations of Epstein-Barr virus. Philos Trans R Soc Lond B Biol Sci. 2001;356(1408):461-473. (PMID: 11313005)
  3. Grande BM, Gerhard DS, Jiang A, et al. Genome-wide discovery of somatic coding and non-coding mutations in pediatric endemic and sporadic Burkitt lymphoma. Blood. 2019; pii: blood-2018-09-871418. doi: 10.1182/blood-2018-09-871418. (PMID: 30617194)

CTD² Guest Editorial
Systems Cancer Immunology for the Masses

Erin F. Simonds, Ph.D., Edbert D. Lu, Ph.D., and William A. Weiss, Ph.D.
UCSF Departments of Neurology, Neurological Surgery, and Pediatrics

Cytometry by time-of-flight (CyTOF®), or generically, mass cytometry, is an antibody-based method of measuring cellular phenotypes in single-cell suspensions. This technology is unique in its ability to multiplex up to 44 distinct antibody-based markers, in millions of cells per day, at a low cost per cell, with minimal signal overlap between markers¹. Mass cytometry achieves these improvements by using atomic mass spectrometry to measure metal-tagged antibodies, rather than lasers and fluorochrome-tagged antibodies, as in classical flow cytometry.

While mass cytometry is one of several available single-cell analysis platforms, it is particularly well-suited for immuno-oncology research. In immuno-oncology, researchers are often interested in the abundance and co-expression patterns of 30-50 markers of immune cell populations. Critically, these markers tend to be those for which flow cytometry-compatible antibodies have already been developed, therefore making them easily adaptable for mass cytometry. As mass cytometry can measure up to 44 antibody-based markers per cell, it opens up “systems cancer immunology” to the masses by formalizing and simplifying the analysis of diverse immune cell subsets across immune organs, tumor types, and drug treatments. For example, several groups have used mass cytometry to monitor T lymphocyte, B lymphocyte, natural killer, myeloid, and dendritic cell phenotypes simultaneously with 30-marker panels, leaving some bandwidth for novel or investigational markers. Consolidating all of these antibodies to a single staining cocktail simplifies sample processing and analysis across experimental conditions.

CyTOF® mass cytometry was developed in the mid-2000s by engineers at DVS Sciences in Toronto, Canada. In the years since, it has matured as a discipline with a wide user base. A third-generation instrument, the Helios™ mass cytometer, was released by Fluidigm Corporation in 2015, a year after its acquisition of DVS Sciences. The first publications with this technology appeared in 2010-2011 and applied mass cytometry to studies of normal human immunology and hematopoiesis. Since 2015, numerous high-profile publications have featured mass cytometry analysis of a wide range of leukemias and cancer types, including mouse models of skin, colorectal, and breast cancer, as well as cohorts of human patients with AML, pre-B ALL, multiple myeloma, non-small cell lung cancer, clear cell renal carcinoma, melanoma, glioblastoma, and ovarian cancer.

Figure: Mass cytometry analysis of tumor and immune tissues informs cancer systems immunology. The sample processing and antibody staining steps for mass cytometry are analogous to flow cytometry, except that metal-conjugated antibodies are used (top row). Inside the instrument, antibody-labeled cells are sprayed through a glass nebulizer and carried by a stream of argon into an inductively-coupled plasma (ICP) torch, where the metal atoms are liberated from the antibodies and directed as single-cell ion clouds into a mass spectrometer (middle row). The raw data from the mass spectrometer is converted into single-cell measurements of antibody abundance, which is then fed into a variety of computational tools to reveal cellular phenotypes and changes in marker expression (bottom row). (Image credit: Simonds EF. (2019) UCSF)

Mass cytometry produces data with a unique combination of high dimensionality and high throughput compared to other technologies for single-cell profiling. The workflow (Figure) is similar to flow cytometry: samples must first be processed into single-cell suspensions (e.g., by enzymatic digestion of solid tumors), which can then be either measured immediately, cryopreserved as viable cells, or fixed and frozen. Cells are then stained with metal-conjugated antibodies against target markers, as well as metal-based dyes for viability, DNA content, and sample identification (“barcoding”). The specialized processing needed for flow or mass cytometry analysis of solid tumors requires more hands-on time and expertise than banking formalin-fixed or frozen tissue blocks, which has limited the use of this technology for existing pathology sample archives. However, as the field of immuno-oncology grows, prospective banking of single-cell suspensions is becoming more standardized and commonplace, particularly in the setting of mouse studies and clinical trials.

One challenge in mass cytometry experimental design is that the panel of ~40 metal-tagged antibodies must be selected a priori. Metal-conjugated antibodies to commonly used targets in immunology are available commercially; others must be identified from publications or conjugated in-house. Because mass cytometry is an antibody-based approach, the range of potential analytes is largely limited by the availability of commercial antibody clones. As with flow cytometry, fluorescence microscopy, or qPCR, this inherently creates a “lamppost” situation in which the scope of discovery is limited to the set of preselected targets. Other single-cell approaches that cast a wider net, such as single-cell RNA-sequencing (scRNA-seq) or DNA-tagged antibodies, fill the need for a broader look at an experimental system. However, these less-biased methods tend to be significantly more expensive on a per-cell basis. For example, scRNA-seq using the 10X Chromium™ system costs approximately $0.10 per cell. In our hands, when using commercially available metal-conjugated antibodies and paying for use of a shared Helios™ mass cytometer, a 40-parameter analysis of 100,000 cells costs about $300, or about $0.003 per cell. It is possible to further reduce costs by using metal-conjugated antibodies that are prepared in-house, although this requires more optimization. Access to mass cytometry is continually improving for institutions without the necessary instrumentation or expertise. Academic cores offering full-service mass cytometry sample preparation and analysis for outside users can now be found on four continents, and a community-maintained list of cores is available online. Building on the innovative detection modality behind mass cytometry, there are now other approaches available that use metal-tagged antibodies. High-parameter imaging-based platforms, such as the IONPath MIBIScope™ I and Fluidigm Hyperion™, offer the opportunity to use archived tissue blocks. These emerging platforms maintain tissue morphology and include information about the spatial distribution of cell subsets (e.g., in blood vessels, lymph nodes, areas of necrosis), which is lost in single-cell suspensions of solid tissue.
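
For readers who want to adapt the per-cell cost comparison quoted above to their own budgets, the arithmetic can be written out in a few lines of Python. This is only a sketch: the dollar figures are the approximate ones given in this article, and the function and variable names are illustrative assumptions rather than part of any published tool.

```python
def cost_per_cell(total_cost_usd, n_cells):
    """Return the average cost of profiling one cell, in US dollars."""
    return total_cost_usd / n_cells


# Approximate figures quoted in the text (actual costs vary by site and panel).
mass_cytometry_run_usd = 300      # 40-parameter run on a shared instrument
mass_cytometry_cells = 100_000    # cells analyzed in that run
scrnaseq_per_cell_usd = 0.10      # approximate scRNA-seq cost per cell

mass_cytometry_per_cell_usd = cost_per_cell(mass_cytometry_run_usd, mass_cytometry_cells)

# How many times more expensive scRNA-seq is on a per-cell basis.
fold_difference = scrnaseq_per_cell_usd / mass_cytometry_per_cell_usd

# How many cells the same $300 budget would cover with scRNA-seq.
scrnaseq_cells_for_same_budget = round(mass_cytometry_run_usd / scrnaseq_per_cell_usd)

print(f"Mass cytometry: ${mass_cytometry_per_cell_usd:.3f} per cell")
print(f"scRNA-seq is roughly {fold_difference:.0f}-fold more expensive per cell")
print(f"$300 of scRNA-seq covers about {scrnaseq_cells_for_same_budget} cells")
```

The comparison covers only per-cell cost; it does not account for the much larger number of analytes measured per cell by sequencing-based methods.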

The most challenging hurdle in the field of mass cytometry is the process of data analysis. Fortunately, in the last decade, the workflow for mass cytometry data analysis has become more accessible and streamlined. Moreover, the unique needs of the field have inspired new approaches to analyze this type of data. Mass cytometry data can be thought of as a long table with a few dozen columns and millions of rows. A typical experiment may contain a table of this size for many patients or experimental conditions. In most cases, this scale of data exceeds conventional approaches for flow cytometry data analysis, such as dot plots or histograms. The breadth of up to 44 markers per cell also creates an opportunity to discover unanticipated combinations or expression patterns, so an unbiased analysis approach is often preferred. A popular strategy has been to reduce the dimensionality of the data by collapsing multiple columns into meta-parameters, while collapsing multiple rows into subpopulations of cells (“clusters”). This strategy can retain key information about distinct cell subsets, while making the data more interpretable by eye. Purpose-built clustering algorithms for mass cytometry such as PhenoGraph, FlowSOM, and SCAFFOLD have emerged. A popular method to view these clusters, or drill down to view the underlying single-cell data, is to organize them by similarity on a 2D plot using t-distributed stochastic neighbor embedding (t-SNE; implemented in the popular viSNE, cytofkit, and Cytosplore packages). With this plethora of data visualization and interpretation tools, it may be difficult to decide which ones to use. New mass cytometrists should assess which tools best fit their research needs before settling on an analysis pipeline. Several academic reviews on the subject are now available to guide new users, and the online community at Cytoforum serves as an interactive and historical record of advice.
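
To make the "collapse rows into clusters" idea concrete, the sketch below groups simulated cells and summarizes each cluster by its median marker expression. It rests on several assumptions: the data are randomly generated rather than real, the three marker names are chosen only for illustration, plain k-means with farthest-point seeding stands in for purpose-built tools such as PhenoGraph or FlowSOM, and the arcsinh transform with a cofactor of 5 is a common convention rather than a step described in this article.

```python
import math
import random
from statistics import median

MARKERS = ["CD45", "CD3", "CD11b"]  # illustrative panel, not a real experiment


def arcsinh_transform(value, cofactor=5.0):
    """Compress the dynamic range of a raw intensity (cofactor is an assumed setting)."""
    return math.asinh(value / cofactor)


def sq_dist(a, b):
    """Squared Euclidean distance between two cells."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(rows, k, n_iter=20):
    """Plain k-means; a simple stand-in for dedicated cytometry clustering tools."""
    # Farthest-point seeding: start from the first cell, then repeatedly add
    # the cell that lies farthest from every center chosen so far.
    centers = [list(rows[0])]
    while len(centers) < k:
        farthest = max(rows, key=lambda r: min(sq_dist(r, c) for c in centers))
        centers.append(list(farthest))
    labels = [0] * len(rows)
    for _ in range(n_iter):
        # Assign each cell to its nearest center.
        labels = [min(range(k), key=lambda c: sq_dist(row, centers[c])) for row in rows]
        # Move each center to the mean of the cells assigned to it.
        for c in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def simulate(rng, n_cells, mean_intensities):
    """Draw noisy, non-negative raw intensities around the given per-marker means."""
    return [
        [max(0.0, rng.gauss(m, 0.1 * m + 0.5)) for m in mean_intensities]
        for _ in range(n_cells)
    ]


rng = random.Random(42)
# Two simulated populations: T-cell-like (CD3 high) and myeloid-like (CD11b high).
raw = simulate(rng, 200, [300, 200, 2]) + simulate(rng, 200, [300, 2, 250])
cells = [[arcsinh_transform(v) for v in row] for row in raw]

labels = kmeans(cells, k=2)

# Collapse the long single-cell table into one row of medians per cluster.
cluster_medians = {}
for cluster in sorted(set(labels)):
    members = [row for row, lab in zip(cells, labels) if lab == cluster]
    cluster_medians[cluster] = [median(col) for col in zip(*members)]

for cluster, medians in cluster_medians.items():
    summary = ", ".join(f"{name}={value:.2f}" for name, value in zip(MARKERS, medians))
    print(f"cluster {cluster}: {summary}")
```

On real data, the cluster-by-marker table of medians is the kind of summary that is typically plotted as a heatmap, while an embedding such as t-SNE is used to inspect the underlying single cells.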

Mass cytometry analysis of mouse tumors is a key component of an ongoing NCI-funded Cancer Target Discovery and Development (CTD²) project led by Drs. Allan Balmain, Matthew Krummel, and William Weiss at UCSF. This project uses a variety of advanced technologies, including mass cytometry, to compare the immune infiltrate in immune-competent mouse models of solid tumors, including squamous cell carcinoma, breast carcinoma, and glioblastoma. These tumor models were selected because they have distinct patterns of antigen presentation and likely different strategies to evade the immune system. Mass cytometry analysis of these tumors has revealed fine-grained subsets of different immune cell types within the tumor microenvironment and systemic effects in other immune tissues, such as lymph nodes. While much of the focus in immunotherapy has historically been on T lymphocytes, a major goal of this research is to understand the diverse macrophage and dendritic cell populations within tumors, which are resolved in detail by the high-dimensional profiling of mass cytometry. Mass cytometry, and especially its application to syngeneic mouse models, is an invaluable tool for the field of systems cancer immunology. This approach of systematically comparing how the immune system behaves in the context of different tumor types and immunotherapies, across different organs and timepoints, will shed light on the underlying immune defects and inspire new approaches to restore anti-tumor immunity.

Mass cytometry complements other technologies, especially single-cell RNA sequencing, by offering rich, protein-level information on millions of cells. Over the last ten years, it has matured from a niche technology to a powerful and widely used tool, especially in the field of cancer immunology. There are now numerous review articles and online tools to help new users as they design custom-tailored antibody panels and analyze the data (2, 3, 4). New technological tools, such as mass cytometry, coupled with appropriate mouse models of cancer, will help the field of systems cancer immunology decipher the web of cell types and responses as the immune system engages a tumor.


  1. Bendall SC, Nolan GP, Roederer M, Chattopadhyay PK. A deep profiler’s guide to cytometry. Trends Immunol. 2012; 33 (7): 323-332. (PMID: 22476049)
  2. Spitzer MH, Nolan GP. Mass Cytometry: Single Cells, Many Features. Cell. 2016; 165 (4): 780-791. (PMID: 27153492)
  3. Olsen LR, Leipold MD, Pedersen CB, Maecker HT. The anatomy of single cell mass cytometry data. Cytometry A. 2019; 95 (2): 156-172. (PMID: 30277658)
  4. Mistry AM, Greenplate AR, Ihrie RA, Irish JM. Beyond the message: advantages of snapshot proteomics with single-cell mass cytometry in solid tumors. FEBS J. 2018; doi: 10.1111/febs.14730 (PMID: 30549207)

OCG Perspective
Promoting Scientific Initiatives through Science Communication

Cindy Kyi, Ph.D.
Office of Cancer Genomics, NCI

I joined the Office of Cancer Genomics (OCG) as a fellow through NCI's Health Communications Internship Program (HCIP) after graduating with a Ph.D. in Biological Sciences from the University of Missouri, Columbia. As a graduate student, I studied the plasticity of neurons in major pelvic ganglia (MPG) involved in the micturition pathways in response to spinal cord injury. Although mobility is an apparent disability in spinal cord injured patients, autonomic functions such as micturition and defecation are the lesser known functions that are also compromised. My dissertation work focused on plasticity of neurotransmission at the MPG in response to loss of input. My work shed some light on how loss of neural inputs affects synaptic events such as response to neurotransmitters and the input-output dynamics of the MPG synapses.

As much as I enjoyed being a laboratory scientist, I realized after a few years in graduate school that I would like to support scientific projects in a different role: facilitating the day-to-day functions of scientific programs so that I can contribute in a broader context while staying abreast of scientific endeavors and findings. Specifically, I wanted to be involved in scientific communications because I recognized the importance of making scientific facts and information widely available to the public. In this age of advanced technology and social media, one can find an enormous amount of information on any topic. While accurate information is available from many reliable sources, there is also plenty of misleading or false information from non-credible sources. It is the responsibility of scientists, professionals, and educators to lead the public to accurate information based on facts and evidence. By communicating scientific findings to the non-expert public, I aspire to help increase the public's scientific literacy and awareness. Hence, I sought opportunities where I could apply my scientific background and improve my skills in scientific communications.

As someone straight out of graduate school who would like to transition from the bench to facilitating scientific programs, the health communications fellowship at the OCG was a perfect opportunity for me. OCG currently has four main scientific programs, which support research to better understand the molecular underpinnings of cancer with the goal of advancing precision oncology. OCG programs support the development of technology, tools, and human tissue-derived next-generation cancer models for rare, pediatric, and high-risk adult cancers.

OCG communications play important roles (as shown in the figure) in ensuring the continuous flow of information not only within the OCG and the NCI but also to the scientific community and the public through our web pages, newsletters, and social media. The OCG website functions as the main source of information on OCG programs, resources, guidelines to data usage, as well as news and publications.

OCG Communications

Figure: The diagram indicates various communications functions within the Office of Cancer Genomics. (SOPs = standard operating procedures)

As a health communications fellow, my goal is to maintain the accuracy, user-friendliness, and readability of our web content for the target audience, as well as to broaden our reach to researchers and the non-scientific audience. One of my main responsibilities is managing content on OCG webpages to ensure that it is up-to-date, accurate, engaging, and functional. In addition, I contribute to developing content for the OCG e-Newsletter by suggesting topics, editing, and writing articles that highlight the scientific technology and findings from OCG-supported programs. To make OCG program highlights and outcomes more visible and reach a broader audience, I am involved with drafting tweets for the NCI Genomics Twitter account. Another important task is assessing website engagement through web analytics data, because this provides us with tangible feedback on how our communications efforts are being utilized. As part of the communications team, I prepare bi-monthly (every two months) web analytics reports and present the data to the office staff. This process assists us in understanding web user engagement and formulating strategies to increase user traffic. My other duties include working with program managers to develop editorial guidelines and fact sheets, and presenting OCG initiatives and promoting OCG-supported resources at the National Institutes of Health (NIH) Research Festival and the American Association for Cancer Research annual meeting.

Since the inception of my fellowship in August 2018, I have learned: 1) the integral role of communications in facilitating information flow within the NCI, as well as to extramural researchers involved in OCG-supported programs and to the general public; 2) the significance of precision in language and timely communication in carrying out scientific programs; and 3) the importance of multidisciplinary research in a disease such as cancer, a global adversary experienced across various races, sexes, and ages.

Although cancer is widespread, its heterogeneity makes effective treatment difficult. Cancer is diverse and ‘personal’; hence, it is important for the global research community to work together and come up with treatments that are tailored to a specific type of cancer, patient demographic, and clinical background. My current position at OCG allows me to appreciate the highly collaborative nature of scientific initiatives in which researchers from around the world come together to fight a common enemy: cancer. I hope I can apply the broad communications skills that I am learning as an HCIP fellow in facilitating and managing global health and scientific research initiatives in the future.