Transcriptome Sequencing


Data Generation Protocols

The data generation protocols for the HTMCP-Cervical Cancer project were acquired from the following manuscript.

Gagliardi A, Porter VL, Zong Z, et al. Analysis of Ugandan cervical carcinomas identifies human papillomavirus clade-specific epigenome and transcriptome landscapes. Nat Genet. 2020;52(8):800-810. (PMID: 32747824)

PolyA RNA library construction

Polyadenylated (PolyA+) messenger RNA was purified from 3 µg total RNA using the 96-well MultiMACS mRNA isolation kit on the MultiMACS 96 separator (Miltenyi Biotec, Germany), with on-column DNase I treatment as per the manufacturer's instructions. Eluted PolyA+ RNA was ethanol-precipitated and resuspended in 10 µL of DEPC-treated water with 1:20 SuperaseIN (Life Technologies, USA).

First-strand cDNA was synthesized from the purified messenger RNA using the Maxima H Minus First Strand cDNA Synthesis kit (Thermo Fisher, USA) and random hexamer primers with 0.4 µg/µL Actinomycin D, followed by PCR Clean DX bead purification on a Microlab NIMBUS robot (Hamilton Robotics, USA). The second strand cDNA was synthesized following the NEBNext Ultra Directional Second Strand cDNA Synthesis protocol (NEB) that incorporates dUTP in the dNTP mix, allowing the second strand to be digested using USER™ enzyme (NEB) in the post-adapter ligation reaction and thus achieving strand specificity.

cDNA was fragmented by sonication (Covaris) to achieve 200-250 bp fragment lengths. Sheared cDNA was then subjected to end-repair and phosphorylation in a single reaction using an enzyme premix (NEB) containing T4 DNA polymerase, Klenow DNA polymerase and T4 polynucleotide kinase. Repaired cDNA was purified and 3’ A-tailed (adenylated) using Klenow fragment (3’ to 5’ exo minus), and Illumina PE adapters were ligated. The adapter-ligated products were purified using PCR Clean DX beads, then digested with USER™ enzyme (1 U/µL, NEB), followed immediately by 13 cycles of indexed PCR using Phusion DNA Polymerase (Thermo Fisher Scientific Inc., USA) and Illumina’s PE primer set. PCR products were purified and size-selected twice using a 1:1 PCR Clean DX beads-to-sample ratio, and the eluted DNA was assessed for quality and quantified prior to library pooling and size-corrected final molar concentration calculation.
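A size-corrected molar concentration is commonly derived from the measured mass concentration and the mean fragment length of the library. The sketch below illustrates that conversion, assuming an average mass of 660 g/mol per base pair of double-stranded DNA. The function name and the example values are hypothetical and are not taken from the protocol, which does not state the exact formula used.

```python
def library_molarity_nm(conc_ng_per_ul: float, mean_fragment_bp: float) -> float:
    """Convert a dsDNA mass concentration (ng/µL) to a molar concentration (nM).

    Assumption: 660 g/mol per base pair, a common approximation for dsDNA.
    ng/µL equals mg/L, so nM = (ng/µL * 1e6) / (660 * fragment length in bp).
    """
    if mean_fragment_bp <= 0:
        raise ValueError("mean fragment length must be positive")
    return conc_ng_per_ul * 1e6 / (660.0 * mean_fragment_bp)


# Hypothetical library: 2.0 ng/µL with a mean fragment size of 330 bp
example_nm = library_molarity_nm(2.0, 330.0)
print(f"{example_nm:.2f} nM")
```

Because the result scales inversely with fragment length, two libraries with the same mass concentration but different size distributions contribute different numbers of molecules to a pool, which is why the correction is applied before pooling.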

Transcriptome sequencing

Transcriptomes were sequenced using 75 bp paired-end reads on an Illumina HiSeq 2500.

Experimental protocols

To request more information or approval regarding the following protocols, please contact BC Cancer at labqa@bcgsc.ca

96-well Plate-based Strand-specific cDNA Synthesis using Maxima H Minus on Hamilton NIMBUS

Nimbus-assisted 96-well PCR-enriched Library Construction for Illumina Sequencing

Magnetic bead-based mRNA isolation 

miRNA3 - Plate Format miRNA Library Construction

Data Analysis Protocols

The data analysis protocols for the HTMCP-Cervical Cancer project were acquired from the following manuscript.

Gagliardi A, Porter VL, Zong Z, et al. Analysis of Ugandan cervical carcinomas identifies human papillomavirus clade-specific epigenome and transcriptome landscapes. Nat Genet. 2020;52(8):800-810. (PMID: 32747824)

Expression profiling

RNA-Seq reads were aligned to the human reference genome (hg19) with BWA-MEM (v0.7.6a). Reads aligned to exon junctions were repositioned in the genome as large-gapped alignments using in-house software (JAGuaR v1.7.5). Unambiguously aligned, filtered reads were used to calculate coverage over the total collapsed exonic regions in each gene as annotated in Ensembl v69, and RPKM values were calculated for each gene. Structural variants in RNA-Seq data were identified using the assembly-based tools ABySS v1.3.4 and TransABySS v1.4.10 and the alignment-based tool DeFuse (v0.6.2).
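For reference, RPKM (reads per kilobase of exon per million mapped reads) normalizes a gene's read count by both gene length and sequencing depth. The sketch below shows the standard textbook definition. It is not the in-house pipeline, which computes coverage over collapsed exonic regions and may differ in detail, and the function name and example numbers are hypothetical.

```python
def rpkm(read_count: float, exonic_length_bp: float, total_mapped_reads: float) -> float:
    """Standard RPKM: reads per kilobase of exon per million mapped reads.

    RPKM = reads * 1e9 / (exonic length in bp * total mapped reads)
    """
    if exonic_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and library size must be positive")
    return read_count * 1e9 / (exonic_length_bp * total_mapped_reads)


# Hypothetical gene: 500 reads over 2,000 bp of collapsed exons,
# in a library with 20 million mapped reads
example_rpkm = rpkm(500, 2_000, 20_000_000)
print(example_rpkm)
```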

Gene expression and gene ontology enrichment analyses

Clustering analysis was performed with ConsensusClusterPlus (v1.38.0, R) on log10(RPKM) values, using the ‘Pearson’ method and ‘ward.D2’ linkage with 1000 iterations. Human gene clustering used the top 1,000 most variable genes (RPKM > 5 in at least one patient). All 118 samples were included in human gene clustering and 117 in viral gene clustering (no gff file was available for HPV51).
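The gene-selection step that precedes clustering can be sketched as follows. This is an illustrative reading of the description above and not the authors' code: ranking genes by variance on the log10 scale and adding a pseudocount of 1 before the log transform are both assumptions, and the function and gene names are hypothetical.

```python
import math
from statistics import pvariance


def top_variable_genes(rpkm_by_gene, n_top=1000, min_rpkm=5.0, pseudocount=1.0):
    """Return up to n_top gene names ranked by variance of log10(RPKM).

    rpkm_by_gene maps gene name -> list of RPKM values, one per sample.
    Genes whose RPKM never exceeds min_rpkm in any sample are dropped first.
    The pseudocount (an assumption) avoids log10(0) for unexpressed genes.
    """
    scored = []
    for gene, values in rpkm_by_gene.items():
        if max(values) <= min_rpkm:
            continue  # not expressed above the threshold in any sample
        logged = [math.log10(v + pseudocount) for v in values]
        scored.append((pvariance(logged), gene))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [gene for _, gene in scored[:n_top]]


# Toy input with three hypothetical genes across three samples
toy = {
    "GENE_A": [0.5, 1.0, 2.0],     # filtered out: never above RPKM 5
    "GENE_B": [1.0, 50.0, 400.0],  # highly variable
    "GENE_C": [10.0, 11.0, 12.0],  # expressed but nearly constant
}
selected = top_variable_genes(toy, n_top=2)
print(selected)
```

The resulting gene-by-sample matrix of log-transformed values would then be passed to the consensus clustering step.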

Differential gene expression analysis between groups (A7 vs. A9; E6 and E7 high vs. low) was performed using DESeq2 (v1.14.1, R). Genes were filtered using an adjusted p-value <0.05, >1.5-fold change in mean expression, and a baseMean expression >1000. For the A7 vs. A9 comparison, the differential analysis was normalized for histology using a multifactorial approach. Results from the normalized analysis were compared to those using only squamous A7 and A9 samples to ensure that the histology correction was only removing expression differences attributable to histology (89% concordance).
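The three filtering thresholds can be expressed as a single predicate over a results table. The sketch below is illustrative only: it assumes the fold-change cutoff applies in both directions (up- and downregulated), that fold changes are reported on the log2 scale as in DESeq2 output, and that genes with a missing adjusted p-value are excluded. The record type, function name and example rows are hypothetical.

```python
import math
from typing import NamedTuple, Optional


class DeResult(NamedTuple):
    gene: str
    base_mean: float
    log2_fold_change: float
    padj: Optional[float]  # adjusted p-value; may be missing for some genes


def passes_filters(r: DeResult, max_padj=0.05, min_fold_change=1.5,
                   min_base_mean=1000.0) -> bool:
    """Apply the padj, fold-change and baseMean thresholds to one gene."""
    if r.padj is None or math.isnan(r.padj):
        return False  # assumption: genes without an adjusted p-value are dropped
    return (
        r.padj < max_padj
        and abs(r.log2_fold_change) > math.log2(min_fold_change)
        and r.base_mean > min_base_mean
    )


# Hypothetical result rows
rows = [
    DeResult("GENE_A", 2500.0, 1.2, 0.001),   # passes all three filters
    DeResult("GENE_B", 2500.0, 0.3, 0.001),   # fold change too small
    DeResult("GENE_C", 400.0, -2.0, 0.001),   # baseMean too low
    DeResult("GENE_D", 5000.0, -0.9, 0.20),   # not significant
    DeResult("GENE_E", 1800.0, -1.1, 0.01),   # passes (downregulated)
]
kept = [r.gene for r in rows if passes_filters(r)]
print(kept)
```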

Functional enrichment of the significantly differentially expressed genes in the A7 vs. A9 comparison was performed using STRING (v11.0). For visualisation, enrichment scores for A7-enriched ontologies were set to negative values. Functional enrichment of the significantly differentially expressed upregulated and downregulated gene lists for the E6 and E7 analysis was performed using HOMER (v4.10.3).

Estimation of immune cell content

To quantify expression signatures of leukocytes in our samples, we ran CIBERSORT (v1.0.4) on the expression RPKMs using the -p 500, -q and -a options and the LM22 signature matrix provided. Total CD4+ T-cell content is the sum of the CIBERSORT scores for the CD4+ T-cell subsets: naive, memory resting, memory activated, follicular helper and regulatory T cells.
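Summing the five CD4+ subsets for one sample might look like the sketch below. The column labels are assumed to match the cell-type names used in the LM22 signature matrix and should be checked against the actual CIBERSORT output. The function name and the example scores are hypothetical.

```python
# Assumed LM22 column labels for the five CD4+ T-cell subsets
CD4_COLUMNS = (
    "T cells CD4 naive",
    "T cells CD4 memory resting",
    "T cells CD4 memory activated",
    "T cells follicular helper",
    "T cells regulatory (Tregs)",
)


def total_cd4_score(sample_scores: dict) -> float:
    """Sum the CIBERSORT scores of the CD4+ subsets for one sample.

    Raises KeyError if an expected column is absent, so that a renamed or
    missing column is not silently treated as zero.
    """
    return sum(sample_scores[column] for column in CD4_COLUMNS)


# Hypothetical scores for one sample (other cell types are ignored)
sample = {
    "T cells CD4 naive": 0.10,
    "T cells CD4 memory resting": 0.05,
    "T cells CD4 memory activated": 0.02,
    "T cells follicular helper": 0.03,
    "T cells regulatory (Tregs)": 0.04,
    "T cells CD8": 0.20,
}
cd4_total = total_cd4_score(sample)
print(round(cd4_total, 2))
```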

Last updated: August 07, 2020