Analytical Tools

The CTD² Network develops new approaches to identify novel targets and functionally validate discoveries made from large-scale genomic initiatives, such as The Cancer Genome Atlas (TCGA), Therapeutically Applicable Research to Generate Effective Treatments (TARGET), and the Cancer Genome Characterization Initiative (CGCI), and advance them toward precision medicine. Through robust cross-Network collaborations, CTD² (1) mines data to find alterations that potentially influence tumor biology, (2) characterizes the functional roles of candidate alterations in cancers, and (3) identifies novel approaches that target causative alterations either directly or indirectly. Methodologies include bioinformatics, genome-wide gain- and loss-of-function screening, and small molecule high-throughput screening, among others.

Part of the CTD² mission is to make data and tools available and accessible to the greater research community to accelerate the discovery process. Bioinformatics support is often required for analyses of the massive datasets used and generated through experimental pipelines employed by the Network Centers. To facilitate the processes of mining, visualizing, analyzing, and using such datasets, OCG has curated this collection of analytical tools. OCG/CTD² does not endorse any specific tool. However, this list gives researchers a gateway to access many tools that are useful for analyzing and/or visualizing large-scale genomic and/or complex datasets generated through high-throughput screens and other assays.

6 Analytical Tools


Master Regulator Inference algorithm (MARINa) (Columbia University)

MARINa is an algorithm that can be used to identify transcription factors (TFs) that control the transition between two cellular phenotypes. Phenotypic changes effected by pathophysiological events are captured by gene expression profile measurements, which determine mRNA abundance on a genome-wide scale in a cellular population. However, a TF's mRNA expression is not a reliable predictor of its protein activity, because it fails to capture the many post-transcriptional and post-translational events that modulate that activity. To address this problem, MARINa computes the enrichment of each TF's regulon (i.e., its activated and repressed targets) among the genes differentially expressed between the two phenotypic states.
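
The idea of scoring a TF by its regulon, rather than by its own mRNA level, can be illustrated with a small Python sketch. This is not MARINa's actual enrichment statistic: it swaps in a plain signed mean of target scores, and the gene names, TF names, and scores are all hypothetical.

```python
from statistics import mean

def regulon_activity(signature, regulon):
    """Toy activity score for one TF (an assumption, not MARINa's statistic).

    signature: gene -> differential expression score between the two
               phenotypes (positive = higher in the second phenotype).
    regulon:   target gene -> +1 if the TF activates it, -1 if it represses it.
    """
    # Flip the sign of repressed targets so that "repressed target goes down"
    # counts as evidence of TF activation, just like "activated target goes up".
    scored = [mode * signature[gene] for gene, mode in regulon.items() if gene in signature]
    return mean(scored) if scored else 0.0

# Hypothetical differential expression signature.
signature = {"G1": 2.0, "G2": 1.5, "G3": -1.8, "G4": 0.1, "G5": -0.2}

# Hypothetical regulons for two TFs.
regulons = {
    "TF_A": {"G1": +1, "G2": +1, "G3": -1},
    "TF_B": {"G4": +1, "G5": -1},
}

# Rank TFs by the magnitude of their inferred activity change.
ranking = sorted(
    regulons,
    key=lambda tf: abs(regulon_activity(signature, regulons[tf])),
    reverse=True,
)
```

In this toy data, TF_A ranks first because its activated targets go up and its repressed target goes down, even though the TF's own expression never enters the calculation.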

For questions, please contact Andrea Califano.

MD Anderson Cell Line Project (MCLP) Data Portal (University of Texas MD Anderson Cancer Center)

MCLP Data Portal is an interactive resource of proteomic, genomic, transcriptomic, and drug screening data for a large number of cancer cell lines. Protein expression levels were measured using the reverse-phase protein array (RPPA) platform. This bioinformatic resource enables researchers to explore, analyze, and visualize protein expression data of cancer cell lines through four interactive modules: My Protein, Analysis, Visualization, and Data Sets.

For questions, please contact Han Liang.

MethylMix (Stanford University)

MethylMix is an algorithm to identify hyper- and hypomethylated genes for a disease. This approach uses a novel statistic, the Differential Methylation value or DM-value, to define methylation-driven subgroups. It can be used to identify genes within a disease that are both differentially methylated relative to the normal DNA methylation state and transcriptionally predictive. Matched gene expression data are used to separate functional methylation states from merely differential ones by focusing on methylation changes that affect gene expression.
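
The comparison against the normal methylation state can be sketched as follows. The description above does not define the DM-value, so this sketch assumes it is the difference between a tumor subgroup's mean methylation (beta value) and the mean in normal tissue. The subgroup assignments, beta values, and the 0.1 threshold are hypothetical, and how MethylMix derives its subgroups is not reproduced here.

```python
from statistics import mean

def dm_values(tumor_subgroups, normal_betas):
    """Assumed DM-value: subgroup mean methylation minus normal mean methylation.

    tumor_subgroups: subgroup name -> list of beta values (0..1) for one gene.
    normal_betas:    beta values for the same gene in normal samples.
    """
    normal_mean = mean(normal_betas)
    return {name: mean(betas) - normal_mean for name, betas in tumor_subgroups.items()}

def call_state(dm, threshold=0.1):
    """Label a subgroup from its DM-value; the threshold is a hypothetical choice."""
    if dm > threshold:
        return "hypermethylated"
    if dm < -threshold:
        return "hypomethylated"
    return "unchanged"

# Hypothetical beta values for a single gene.
normal = [0.20, 0.22, 0.18]
tumor_subgroups = {
    "subgroup_1": [0.21, 0.19, 0.20],  # looks like normal tissue
    "subgroup_2": [0.70, 0.75, 0.65],  # much higher methylation than normal
}

dm = dm_values(tumor_subgroups, normal)
calls = {name: call_state(value) for name, value in dm.items()}
```

A gene flagged this way would still need the matched expression check described above before being treated as functionally methylated.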

For questions, please contact Olivier Gevaert.

Mining Essentiality Data to Identify Critical Interactions for Cancer Drug Target Discovery and Development (MEDICI) (Emory University)

MEDICI is a computational method that ranks known protein-protein interactions (PPIs). This approach combines Project Achilles shRNA gene silencing data with network models of protein interaction pathways (NCI Pathway Interaction Database) in an analytic framework. The PPIs are ranked based on their essentiality for the survival and proliferation of cancer cells.

For questions, please contact Lee Cooper.

Modular Analysis of Gene Networks In Cancer (MAGNETIC) (University of California San Francisco (1))

MAGNETIC is a bioinformatic approach that integrates multi-omic cancer patient data (e.g., somatic mutations, copy-number alterations, gene methylation, transcriptomes, proteomes, etc.) with pharmacogenomic data from cell lines. This tool performs functional network analysis to identify gene networks (modules) that are preserved in both cancer patients and cell lines. These modules connect tumor genotype to therapy and could be used for biomarker discovery.

For questions, please contact Sourav Bandyopadhyay.

Modulator Inference by Network Dynamics (MINDy2)/ Conditional Inference of Network Dynamics (CINDy) (Columbia University)

MINDy2 and CINDy both infer modulatory events in the cell. They do this by screening a list of candidate modulator proteins and assessing each candidate's effect on the transcriptional control exerted by a transcription factor of interest. Both assess the effect of a modulator on a transcriptional network, but CINDy uses a more sophisticated algorithm that considers the entire expression range of the modulator.
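
The notion of a modulator changing how strongly a TF controls a target can be illustrated with a toy Python sketch. It is not the MINDy2 or CINDy algorithm: it swaps in Pearson correlation as the dependence measure and compares only the samples with the lowest and highest modulator expression, which is an assumption about one simple way to frame the question. All expression values and the `tail_size` parameter are hypothetical.

```python
from statistics import mean

def pearson(x, y):
    """Plain Pearson correlation (assumes neither input is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def modulation_score(modulator, tf, target, tail_size):
    """Change in TF-target correlation between high- and low-modulator samples."""
    # Sort sample indices by modulator expression, then take both tails.
    order = sorted(range(len(modulator)), key=lambda i: modulator[i])
    low, high = order[:tail_size], order[-tail_size:]
    r_low = pearson([tf[i] for i in low], [target[i] for i in low])
    r_high = pearson([tf[i] for i in high], [target[i] for i in high])
    return r_high - r_low

# Hypothetical expression values for 10 samples.
modulator = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.9, 1.0, 1.1, 1.2]
tf        = [1.0, 2.0, 3.0, 4.0, 2.0, 3.0, 1.0, 2.0, 3.0, 4.0]
target    = [1.0, 2.0, 2.0, 1.0, 2.0, 2.0, 1.1, 2.0, 3.1, 3.9]

score = modulation_score(modulator, tf, target, tail_size=4)
```

A large positive score suggests the TF-target relationship is present mainly when the modulator is highly expressed. A method that uses the modulator's entire expression range, as CINDy is described as doing, would not discard the middle samples the way this tail comparison does.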

For questions, please contact Andrea Califano.