Translational Genomics Research Institute: Quantified Cancer Cell Line Encyclopedia RNA-seq Data
Many applications analyze quantified transcript-level abundances to make inferences. Having quantified these abundances across a large sample set, the CTD2 Center at the Translational Genomics Research Institute presents the data in a straightforward, consolidated form for such analyses.
RNA-seq data for 935 cell lines were downloaded from the Cancer Cell Line Encyclopedia (CCLE), and transcript-level abundance was quantified with Salmon 0.4.2 (Patro et al., 2015), using Homo sapiens GRCh37.74 as the reference. Salmon was also run with "--libType IU" (inward, unstranded). The raw BAM files used to generate these data are available from the Genomic Data Commons (GDC). The resulting 935 quantification files, named by sample ID, have 4 columns: Ensembl gene ID, length, number of reads, and transcripts per million (TPM).
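As a rough illustration of how one of these four-column files might be read, the following Python sketch parses a quantification table and ranks entries by TPM. It assumes the files are tab-delimited and that any header or comment lines begin with "#"; neither detail is stated above, so check a downloaded file first. The field names, the helper functions, and the sample rows are all hypothetical and serve only to make the example self-contained.

```python
import csv
import io

# Hypothetical field names; the order follows the column description above.
COLUMNS = ("feature_id", "length", "num_reads", "tpm")


def parse_quant(handle):
    """Parse a four-column, tab-delimited quantification table into a list of dicts."""
    rows = []
    for fields in csv.reader(handle, delimiter="\t"):
        # Assumption: blank lines and lines starting with '#' carry no data.
        if not fields or fields[0].startswith("#"):
            continue
        if len(fields) != len(COLUMNS):
            raise ValueError(f"expected {len(COLUMNS)} columns, got {len(fields)}")
        feature_id, length, num_reads, tpm = fields
        rows.append(
            {
                "feature_id": feature_id,
                "length": float(length),
                "num_reads": float(num_reads),
                "tpm": float(tpm),
            }
        )
    return rows


def top_expressed(rows, n):
    """Return the n rows with the highest TPM, highest first."""
    return sorted(rows, key=lambda row: row["tpm"], reverse=True)[:n]


# Made-up rows standing in for one sample's file; not real CCLE values.
example = io.StringIO(
    "# Name\tLength\tNumReads\tTPM\n"
    "ID_A\t1500\t120.0\t35.5\n"
    "ID_B\t900\t0.0\t0.0\n"
    "ID_C\t2100\t480.5\t101.25\n"
)
rows = parse_quant(example)
top = top_expressed(rows, 2)
print([row["feature_id"] for row in top])
```

For a real file, the in-memory example can be replaced with an opened file handle, for instance `open(path, newline="")`, where `path` points to one of the downloaded sample files.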
Access the Analyzed Data (DCC)
For questions, please contact Sen Peng.
- Patro R, et al. (2015). Accurate, fast, and model-aware transcript expression quantification with Salmon. https://dx.doi.org/10.1101/021592