Analysis of Context-Specific Gene Dependencies for Target Discovery
Zeroing in on therapeutically relevant information in the flood of large-scale genomics data and translating it for clinical applications is an important goal for cancer research. The identification of clinically actionable information has been limited by a reductionist focus on individual genes and interactions rather than on broad genetic interaction networks. Systems approaches, including those supported by the National Cancer Institute’s Cancer Target Discovery and Development (CTD2) Network, are now being developed to provide a more holistic understanding of dysregulated genetic interaction networks in cancer. Among such strategies are network-based approaches that can be used to classify tumors and identify differential gene dependencies, which arise when gene expressions are inter-dependent in a context-dependent manner. Identifying gene dependencies that are unique to different subsets of tumors could lead to novel, actionable therapeutic targets for preclinical analyses. In this article, we describe three studies where we apply network-based approaches to gene expression data from the most common malignant brain tumor, glioblastoma multiforme (GBM). Through these analyses, we (1) more finely subdivided GBM subtypes, (2) revealed gene dependency networks that are specific to each of the four previously defined GBM subtypes, and (3) identified a “druggable” pathway uniquely dysregulated in one of these four subtypes.
Vignette 1: Unraveling molecular contexts
We recently developed a novel computational method to classify tumors into “molecular contexts.” This approach uses a mathematical model to identify changes in gene expression patterns that occur across different subsets of samples1-3. As a simplified example, consider the expression of a gene set compared across 100 samples. The expression pattern may remain unchanged except in 20 of the samples, where expression of one or more genes drives major changes in the expression patterns of a significant number of other genes. These dependencies, together with the 20 samples in which they occur, are considered a “context motif.” When a large number of gene sets are examined across many samples, algorithms can group context motifs into molecular contexts. We applied our method to the GBM gene expression data from The Cancer Genome Atlas (TCGA)4 and identified twelve distinct molecular contexts (Figure 1). These molecular contexts were then investigated to determine whether they are associated with clinical characteristics, such as drug response and survival. More detailed characterization studies are ongoing to determine how the differences between these twelve molecular contexts may be exploited to develop more precise treatments.
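The simplified 100-sample example can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the published method: the gene names, the binarized expression values, the consistency threshold, and the `find_context_motif` helper are all hypothetical and chosen only to make the idea concrete.

```python
# Toy illustration only: binarized expression (1 = high, 0 = low) for 100 samples.
# Gene names and the threshold are hypothetical, not taken from the study.

def find_context_motif(expr, driver, min_consistency=0.9):
    """Return (samples, genes): the samples where `driver` is high, and the
    genes that are consistently high in those samples but mostly low elsewhere."""
    n = len(expr[driver])
    inside = [s for s in range(n) if expr[driver][s] == 1]
    outside = [s for s in range(n) if expr[driver][s] == 0]
    if not inside or not outside:
        return inside, []
    genes = []
    for gene, values in expr.items():
        if gene == driver:
            continue
        frac_in = sum(values[s] for s in inside) / len(inside)
        frac_out = sum(values[s] for s in outside) / len(outside)
        if frac_in >= min_consistency and frac_out <= 1 - min_consistency:
            genes.append(gene)
    return inside, sorted(genes)

n_samples = 100
context = set(range(20))  # the 20 samples in which the driver gene is active

expr = {
    "DRIVER": [1 if s in context else 0 for s in range(n_samples)],
    "TARGET_A": [1 if s in context else 0 for s in range(n_samples)],
    # One noisy sample outside the context is tolerated by the threshold.
    "TARGET_B": [1 if s in context or s == 57 else 0 for s in range(n_samples)],
    # Alternates high/low regardless of the driver, so it is not part of the motif.
    "UNRELATED": [s % 2 for s in range(n_samples)],
}

motif_samples, motif_genes = find_context_motif(expr, "DRIVER")
print(len(motif_samples), motif_genes)
```

Here the motif pairs the 20 samples with the two genes whose expression follows the driver in those samples only. A real analysis would work on continuous expression values and would need a statistical model in place of the fixed threshold.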
Figure 1: Workflow to identify molecular contexts and assess clinical utility. Step 1: Gene expression data are interrogated to identify context motifs. Context motifs are conditional dependencies that occur when expression of a gene influences or changes the expression status of multiple genes for a particular set of samples. Step 2: Context motifs are joined together to construct a context motif graph. Step 3: A graph-clustering algorithm groups context motifs into molecular contexts. Finally, each molecular context is assigned to a previously defined GBM subtype and interrogated for its translational utility using readouts such as drug response or survival.
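Steps 2 and 3 of this workflow can be illustrated with a minimal sketch. The assumptions here are ours and are not drawn from the published method: each motif is reduced to the set of samples in which it occurs, two motifs are joined when their sample sets overlap (Jaccard index at or above a hypothetical cutoff of 0.5), and connected components stand in for the graph-clustering algorithm. The motif names and sample ranges are invented.

```python
from itertools import combinations

def jaccard(a, b):
    """Overlap between two sample sets, from 0 (disjoint) to 1 (identical)."""
    return len(a & b) / len(a | b)

def build_motif_graph(motifs, min_overlap=0.5):
    """Step 2 (simplified): connect motifs whose sample sets overlap strongly."""
    graph = {name: set() for name in motifs}
    for u, v in combinations(motifs, 2):
        if jaccard(motifs[u], motifs[v]) >= min_overlap:
            graph[u].add(v)
            graph[v].add(u)
    return graph

def connected_components(graph):
    """Step 3 (stand-in): group motifs by connected component.
    A real graph-clustering algorithm would be more selective than this."""
    seen, groups = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(graph[node] - group)
        seen |= group
        groups.append(sorted(group))
    return groups

# Hypothetical motifs, each described by the sample indices in which it occurs.
motifs = {
    "m1": set(range(0, 20)),
    "m2": set(range(5, 25)),
    "m3": set(range(60, 80)),
    "m4": set(range(62, 85)),
    "m5": set(range(40, 50)),
}

contexts = connected_components(build_motif_graph(motifs))
print(contexts)  # [['m1', 'm2'], ['m3', 'm4'], ['m5']]
```

Each resulting group plays the role of a molecular context: a set of motifs that recur in largely the same samples.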
Vignette 2: Revealing subtype-specific gene dependencies
Determining molecular interactions specific to cancer subtypes or other cancer-specific conditions (e.g., mutation status of a gene) can provide a more comprehensive understanding of the disease. For this analysis, we developed a network-based method that maps cancer-generic and subtype-specific transcriptional dependencies between contextual gene sets, which are sets of genes with similar expression patterns across a subset of samples. We used this method to analyze the TCGA GBM gene expression data and constructed a contextual gene set dependency network comprising 247 contextual gene sets and 296 dependencies (Figure 2; gene sets are represented by the small filled circles called “nodes” and dependencies are denoted by the lines connecting them). The resulting network map identified dependencies that were associated with generic GBM (Figure 2A) and others that were specific to each of the four GBM subtypes (Classical, Figure 2B and C; Mesenchymal, Figure 2D; Proneural, Figure 2F and G; and Neural, Figure 2E). Dependencies specific to other conditions, including Epidermal Growth Factor Receptor (EGFR) mutation status, O6-methylguanine-DNA-methyltransferase (MGMT) methylation, and patient age <40 years, were also identified (Figure 2H, I, J, K, L, M dashed gold boxes)4.
Figure 2: The GBM contextual gene set dependency network. The contextual gene sets under-expressed across their corresponding conditions are green and over-expressed contextual gene sets are red.
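One way to picture such a network is as a list of edges, each tagged with the condition under which the dependency holds. The sketch below is an assumption about representation only: the `Dependency` type, the gene set identifiers, and the edges are hypothetical placeholders, not entries from the actual 247-node network.

```python
from collections import defaultdict
from typing import NamedTuple

class Dependency(NamedTuple):
    source: str     # contextual gene set (node)
    target: str     # contextual gene set (node)
    condition: str  # condition under which the dependency holds

# Hypothetical edges; identifiers are placeholders, not results from the study.
edges = [
    Dependency("set_01", "set_02", "generic"),
    Dependency("set_02", "set_03", "Proneural"),
    Dependency("set_04", "set_05", "Mesenchymal"),
    Dependency("set_03", "set_06", "Proneural"),
    Dependency("set_07", "set_08", "MGMT methylation"),
]

def group_by_condition(edges):
    """Map each condition to the list of (source, target) dependencies it owns."""
    grouped = defaultdict(list)
    for e in edges:
        grouped[e.condition].append((e.source, e.target))
    return dict(grouped)

def nodes_for(edges, condition):
    """Gene sets that take part in any dependency under the given condition."""
    return sorted({n for e in edges if e.condition == condition
                   for n in (e.source, e.target)})

by_condition = group_by_condition(edges)
print(by_condition["Proneural"])
print(nodes_for(edges, "Proneural"))
```

Grouping edges by condition in this way separates the generic dependencies from those specific to a subtype or to another condition such as MGMT methylation.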
Vignette 3: Discovery of subtype-specific targets
There have been numerous efforts to compare differential dependencies across tumor subtypes to identify subtype-specific targets. Most of the methods developed to uncover differential pathway dependencies and gene interactions focus on individual interactions5-8 or condition-specific sub-networks9-12. Often, these methods fail to account for underlying redundancy in biological processes, such as two genes performing the same cellular function. In addition, the data used to assess the likelihood of a gene dependency network carry uncertainty that is inherent to the technologies used to generate them. To address these shortcomings, we developed a novel network-based algorithm, EDDY: Evaluation of Differential DependencY. EDDY uses a probabilistic approach that more reliably identifies differential patterns of pathway dysregulation between tumor subtypes13.
EDDY was used to analyze the GBM gene expression data from TCGA and identified 10-22 gene sets specific to each subtype. When comparing the Proneural gene set to those of other GBM subtypes, EDDY identified dependencies that were uniquely Proneural, even though traditional gene expression analyses did not show differential expression for the subset of dependencies identified by EDDY (Figure 3A). This implies that the dependency relationships between genes can be significantly different across subtypes even when the overall expression of individual genes is not clearly different.
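The distinction between differential expression and differential dependency can be shown with a small numerical sketch. This is not EDDY itself: it uses a plain Pearson correlation between two hypothetical genes, and the expression values and subtype labels are invented for illustration.

```python
from math import sqrt

def mean(xs):
    return sum(xs) / len(xs)

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical expression of gene X, identical in both subtypes.
gene_x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

# Hypothetical expression of gene Y. Both subtypes contain the same values,
# so the mean of Y is the same, but only in subtype A does Y track X.
subtype_a_y = [1.1, 1.9, 3.2, 3.8, 5.1, 5.9]
subtype_b_y = [3.8, 5.9, 1.1, 5.1, 1.9, 3.2]

mean_difference = abs(mean(subtype_a_y) - mean(subtype_b_y))
r_a = pearson(gene_x, subtype_a_y)
r_b = pearson(gene_x, subtype_b_y)

print(f"difference in mean expression of Y: {mean_difference:.3f}")
print(f"X-Y correlation in subtype A: {r_a:.2f}")
print(f"X-Y correlation in subtype B: {r_b:.2f}")
```

A test that compares mean expression of gene Y between the two subtypes finds no difference, yet the relationship between X and Y is strong in one subtype and weak in the other. That is the kind of signal a dependency-based analysis is designed to pick up.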
Using EDDY, we found that dysregulation of the G2 checkpoint pathway genes was unique to Proneural GBM (Figure 3B). The G2 checkpoint is a DNA damage checkpoint at which cells ensure that DNA has been properly replicated before they proceed with cell division. In the Proneural subtype, G2 checkpoint pathway dependency was also linked to enrichment of mutations in TP53, a tumor suppressor and cell cycle regulator (Figure 3B). WEE1 kinase plays a critical role in G2 checkpoint regulation, so we tested the ability of the WEE1 kinase inhibitor AZD1775 to inhibit the in vitro growth of GBM patient-derived xenograft cell lines with Proneural context and known TP53 status. Interestingly, TP53 mutant GBM cells showed higher sensitivity to WEE1 inhibition than TP53 wild-type GBM cells (Figure 3C). Understanding this differential dependency network may provide insight into the mechanisms that underlie vulnerability in TP53 mutant GBM and therapeutic resistance in TP53 wild-type GBM.
Figure 3: EDDY was used to analyze differential dependencies in GBM. (A) A subset of genes that EDDY identified did not show differential expression between Proneural and non-Proneural GBM subtypes. (B) Differential dependencies in the G2 checkpoint pathway were identified between Proneural and non-Proneural GBM subtypes. (C) In vitro cultures of xenografts derived from GBM patients show differential sensitivity to AZD1775. Sensitivity is dependent on mutation status of TP53 (TP53 wild-type is not sensitive). Mutated TP53 is a genetic characteristic of the GBM Proneural subtype.
The three studies described above provide examples of how network-based applications can be used to uncover differential gene dependencies associated with GBM subtypes. In the future, we hope to expand our analyses to additional cancers in large-scale genomic data repositories such as TCGA. Using these novel network-based approaches to analyze the wealth of genomic and clinical data available through TCGA may lead to the discovery of novel classifiers and drug vulnerabilities to ultimately benefit cancer patients.
- Kim S, Sen I, Bittner ML. (2007) Mining molecular contexts of cancer via in-silico conditioning. Computational Systems Bioinformatics 6, 169-79 (PMID: 17951822)
- Sen I, Verdicchio M, Jung S, Trevino R, Bittner M, Kim S. (2009) Context-Specific Gene Regulations in Cancer Gene Expression Data. Pacific Symposium on Biocomputing 75-86 (PMID: 19213132)
- Ramesh AR, Trevino R, Von Hoff D, Kim S. (2010) Clustering Context-Specific Gene Regulatory Networks. Pacific Symposium on Biocomputing 444-455 (PMID: 19908396)
- Verhaak RGW, Hoadley KA, Purdom E, Wang V, Qi Y, et al. (2010) Integrated genomic analysis identifies clinically relevant subtypes of glioblastoma characterized by abnormalities in PDGFRA, IDH1, EGFR, and NF1. Cancer Cell 17, 98-110 (PMID: 20129251)
- Hu R, Qiu X, Glazko G, Klebanov L, Yakovlev A. (2009) Detecting intergene correlation changes in microarray analysis: a new approach to gene selection. BMC Bioinformatics 10, 20 (PMID: 19146700)
- Lai Y, Wu B, Chen L, Zhao H. (2004) A statistical method for identifying differential gene-gene co-expression patterns. Bioinformatics 20, 3146-55 (PMID: 15231528)
- Leonardson AS, Zhu J, Chen Y, Wang K, et al. (2010) The effect of food intake on gene expression in human peripheral blood. Human Molecular Genetics 19, 159-69 (PMID: 19837700)
- Mentzen W, Floris M, de la Fuente A. (2009) Dissecting the dynamics of dysregulation of cellular processes in mouse mammary gland tumor. BMC Genomics 10, 601 (PMID: 20003387)
- Guo Z, Li Y, Gong X, Yao C, et al. (2007) Edge-based scoring and searching method for identifying condition-responsive protein-protein interaction sub-network. Bioinformatics 23, 2121-2128 (PMID: 17545181)
- Hwang T, Park T. (2009) Identification of differentially expressed subnetworks based on multivariate ANOVA. BMC Bioinformatics 10, 128 (PMID: 19405941)
- Kim Y, Kim T-K, Kim Y, Yoo J, You S, et al. (2010) Principal network analysis: Identification of subnetworks representing major dynamics using gene expression data. Bioinformatics 27, 391-398 (PMID: 21193522)
- Ma H, Schadt EE, Kaplan LM, Zhao H. (2011) COSINE: COndition-SpecIfic sub-NEtwork identification using a global optimization method. Bioinformatics 27, 1290-1298 (PMID: 21414987)
- Jung S, Kim S. (2014) EDDY: a novel statistical gene set test method to detect differential genetic dependencies. Nucleic Acids Research 42, e60 (PMID: 24500204)
Content on this page maintained by the Office of Cancer Genomics