Copy Number

Data Generation Protocols Data Analysis Protocols
Gene Chip® Human Mapping 500K Array (Affymetrix): ALL P1
Genome-Wide Human SNP Array 6.0 (Affymetrix): ALL P1/P2, AML, CCSK, PPTP, WT, OS
HumanHap 550K Beadchip (Illumina): NBL
Human Omni5 BeadChip Kit (Illumina): PPTP

Gene Chip® Human Mapping 500K Array (Affymetrix) for Acute Lymphoblastic Leukemia Phase I (ALL P1)

*Protocol performed at St. Jude Children's Research Hospital.

DNA was extracted using the QIAGEN QIAamp DNA Mini Kit according to the manufacturer’s protocol.

Nucleic acid labeling, hybridization, and array scanning were performed at St. Jude Children’s Research Hospital according to the Affymetrix manufacturer’s protocol for Affymetrix Mapping 250K or Affymetrix Genome-Wide SNP 6.0 arrays.

Normalization and data transformation were carried out at St. Jude Children’s Research Hospital as follows. 250K genotypes were generated using the BRLMM algorithm implemented in GTYPE (Affymetrix). SNP6 genotypes were generated using the Birdseed v2 algorithm in Genotyping Console (Affymetrix). Samples that failed standard QC metrics (contrast QC) were excluded. To generate copy number data, the arrays were analyzed using an extensively used and validated algorithm developed at St. Jude Children’s Research Hospital. Affymetrix SNP array CEL files (Level 1 data) and SNP call files (either .CHP or .TXT files; Level 2 data) were imported into dChip, and probe-level values were summarized [1]. Data were exported and normalized using the reference normalization algorithm [2]. This algorithm uses user-supplied or computationally detected diploid chromosomes to guide normalization of the entire array on a sample-by-sample basis, and optimizes normalization of complex cancer samples while eliminating batch effects. This procedure generates the .cnmz file, which includes both the summarized, normalized probe intensities and the genotype data.

Optimally normalized data were subjected to paired circular binary segmentation [3] with thresholds set to detect copy number segments of >2.3 or <0.7 copies that span at least 5 markers (250K data) or 8 markers (SNP6 data). Raw copy number segmentation results were inspected and curated in dChip.
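The thresholds in the paragraph above can be illustrated with a short Python sketch. It is not the St. Jude pipeline: the `Segment` fields, the platform labels, and the function name are hypothetical, and only the numeric cut-offs (>2.3 or <0.7 copies; at least 5 or 8 markers) come from the text.

```python
from dataclasses import dataclass

# Cut-offs quoted in the protocol text; the platform labels are illustrative.
MIN_MARKERS = {"250K": 5, "SNP6": 8}
GAIN_ABOVE = 2.3  # copies
LOSS_BELOW = 0.7  # copies


@dataclass
class Segment:
    """Hypothetical record for one segment produced by segmentation."""
    chrom: str
    start: int
    end: int
    n_markers: int
    copies: float  # estimated copy number of the segment


def is_reportable(seg: Segment, platform: str) -> bool:
    """True if the segment meets both the marker-count and copy-number thresholds."""
    enough_markers = seg.n_markers >= MIN_MARKERS[platform]
    altered = seg.copies > GAIN_ABOVE or seg.copies < LOSS_BELOW
    return enough_markers and altered


# Made-up segments, for illustration only.
segments = [
    Segment("1", 1_000_000, 2_000_000, 12, 3.1),
    Segment("2", 5_000_000, 5_100_000, 6, 0.4),
    Segment("3", 100_000, 900_000, 40, 2.0),
]
kept_snp6 = [s for s in segments if is_reportable(s, "SNP6")]
kept_250k = [s for s in segments if is_reportable(s, "250K")]
```

With these made-up segments, the 6-marker loss on chromosome 2 is kept under the 250K marker threshold but dropped under the SNP6 threshold.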

References:

  1. Lin M. et al. (2004) dChipSNP: significance curve and clustering of SNP-array-based loss-of-heterozygosity data. Bioinformatics 20, 1233-40 (PMID: 14871870)
  2. Pounds S. et al. (2009). Reference alignment of SNP microarray signals for copy number analysis of tumors. Bioinformatics 25, 315-21 (PMID: 19052058)
  3. Venkatraman E.S. et al. (2007) A faster circular binary segmentation algorithm for the analysis of array CGH data. Bioinformatics 23, 657-63 (PMID: 17234643)

Genome-Wide Human SNP Array 6.0 (Affymetrix) for Acute Lymphoblastic Leukemia Phases I & II (ALL P1, ALL P2)

*Protocol performed at St. Jude Children’s Research Hospital.

DNA was extracted using the QIAGEN QIAamp DNA Mini Kit according to the manufacturer’s protocol.

Nucleic acid labeling, hybridization, and array scanning were performed at St. Jude Children’s Research Hospital according to the Affymetrix manufacturer’s protocol for Affymetrix Mapping 250K or Affymetrix Genome-Wide SNP 6.0 arrays.

Normalization and data transformation were carried out at St. Jude Children’s Research Hospital as follows. 250K genotypes were generated using the BRLMM (Bayesian Robust Linear Model with Mahalanobis distance classifier) algorithm implemented in GTYPE (Genotyping Analysis Software, Affymetrix). SNP6 genotypes were generated using the Birdseed v2 algorithm in Genotyping Console (Affymetrix). Samples that failed standard quality control metrics (contrast quality control) were excluded. To generate copy number data, the arrays were analyzed using an extensively used and validated algorithm developed at St. Jude Children’s Research Hospital. Affymetrix SNP array CEL files (Level 1 data) and SNP call files (either .CHP or .TXT files; Level 2 data) were imported into dChip, and probe-level values were summarized [1]. Data were exported and normalized using the reference normalization algorithm [2]. This algorithm uses user-supplied or computationally detected diploid chromosomes to guide normalization of the entire array on a sample-by-sample basis, and optimizes normalization of complex cancer samples while eliminating batch effects. This procedure generates the .cnmz file, which includes both the summarized, normalized probe intensities and the genotype data.

Optimally normalized data were subjected to paired circular binary segmentation [3] with thresholds set to detect copy number segments of >2.3 or <0.7 copies that span at least 5 markers (250K data) or 8 markers (SNP6 data). Raw copy number segmentation results were inspected and curated in dChip.

References

  1. Lin M. et al. (2004) dChipSNP: significance curve and clustering of SNP-array-based loss-of-heterozygosity data. Bioinformatics 20, 1233-40 (PMID: 14871870)
  2. Pounds S. et al. (2009). Reference alignment of SNP microarray signals for copy number analysis of tumors. Bioinformatics 25, 315-21 (PMID: 19052058)
  3. Venkatraman E.S. et al. (2007) A faster circular binary segmentation algorithm for the analysis of array CGH data. Bioinformatics 23, 657-63 (PMID: 17234643)

Genome-Wide Human SNP Array 6.0 (Affymetrix) for Acute Myeloid Leukemia (AML)

*Protocols performed at the Fred Hutchinson Cancer Research Center.

All genotyping was performed according to the manufacturer’s protocol. Briefly, two identical aliquots containing 250 ng of DNA were digested with specific restriction enzymes in separate reactions; one reaction contained NspI and the other StyI. Immediately following digestion, each sample was ligated with adaptors containing a sequence complementary to the overhang generated by digestion. Following ligation, each sample was subjected to PCR amplification using standard reagents. Following PCR, each sample was assayed on a 2% agarose gel to ensure that a DNA smear of appropriate size was produced. The Nsp and Sty amplifications were combined, purified and quantitated. All samples with at least 180 micrograms total DNA were allowed to continue to fragmentation by enzymatic reaction with the Affymetrix Fragmentation Reagent. The fragmented DNA was assayed on a 4% agarose gel to ensure that the size of the DNA collapsed to less than 75 nt. Following fragmentation, the DNA was end-labeled with terminal deoxynucleotidyl transferase and Affymetrix DNA Labeling Reagent.

Nucleic acid hybridization was performed according to the manufacturer's protocol for the Affymetrix 6.0 SNP array. Each sample was resuspended in hybridization buffer and hybridized to the Affymetrix 6.0 array for 16 hours. Following hybridization, the arrays were washed on the Affymetrix Fluidics Station and scanned on the GeneChip scanner.

The array scanning protocol was performed according to the manufacturer's protocol for the Affymetrix 6.0 SNP array.

All data were processed using the standard analysis suite provided by Affymetrix. The QC call rate is determined for each sample using a subset of SNPs and the DM algorithm. A QC call rate of greater than 87% is a passing score for Affymetrix; the average call rate for this dataset was 99.4%. Samples passing the QC call rate are then clustered using the Birdseed algorithm. Individual data files (CEL files) were uploaded to Partek Genomics Suite (St. Louis, MO). Using a paired analysis (each patient’s remission sample was used as the reference), copy number was calculated for each probeset and is reported in the Level 2 file, TARGET_AML_level2_paired_CN_log2_format.txt.

To find amplified or deleted regions of the genome, the Partek segmentation algorithm was applied to the Level 2 dataset.

The unfiltered copy number segmentation file, TARGET_AML_CN_level3_unfiltered_Diagnostic.txt, contains all segments for each patient, both changed and unchanged, with no filtering parameters applied. To reduce the number of false positives and filter out segments of the genome that are unchanged, the filtered file, TARGET_AML_CN_level3_filtered_Diagnostic.txt, contains only segments that have a copy number <1.7 or >2.3, more than 99 markers, and a p-value <0.05. In addition, segments from the Y chromosome and mitochondrial genome were removed.
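The filtering criteria above can be sketched in Python as follows. This is a minimal illustration, not the Partek workflow: the dictionary keys (`chrom`, `copy_number`, `markers`, `p_value`) and the chromosome labels `"Y"` and `"MT"` are assumptions and may not match the column names in the TARGET files.

```python
# Assumed chromosome labels for the Y chromosome and the mitochondrial genome.
EXCLUDED_CHROMS = {"Y", "MT"}


def passes_filter(seg: dict) -> bool:
    """Apply the criteria quoted in the text to one segment record."""
    if seg["chrom"] in EXCLUDED_CHROMS:
        return False
    changed = seg["copy_number"] < 1.7 or seg["copy_number"] > 2.3
    return changed and seg["markers"] > 99 and seg["p_value"] < 0.05


# Made-up rows standing in for the unfiltered Level 3 segments.
unfiltered = [
    {"chrom": "7", "copy_number": 2.9, "markers": 450, "p_value": 0.001},
    {"chrom": "5", "copy_number": 1.2, "markers": 80, "p_value": 0.001},   # too few markers
    {"chrom": "9", "copy_number": 2.0, "markers": 900, "p_value": 0.2},    # unchanged
    {"chrom": "Y", "copy_number": 0.9, "markers": 300, "p_value": 0.001},  # excluded chromosome
]
filtered = [seg for seg in unfiltered if passes_filter(seg)]
```

Only the chromosome 7 gain survives in this example; the other rows each fail one of the stated criteria.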

Genome-Wide Human SNP Array 6.0 (Affymetrix) for Clear Cell Sarcoma of the Kidney (CCSK)

Nucleic acid labeling, hybridization, and array scanning were performed on 11 CCSKs according to the manufacturer’s protocol for the Affymetrix 6.0 SNP array (Affymetrix, Santa Clara, CA, USA) and processed with the Affymetrix Genotyping Console (GTC) 4.0 software. Reference normalization was performed as described by Pounds et al. [1]. Circular binary segmentation (CBS) was performed using DNAcopy from Bioconductor. Segmented regions of autosomal chromosomes containing at least 8 markers in which the log2 value was > +0.5 or < -0.5 were considered regions of gain or loss, respectively. For the other 2 CCSK samples, copy number was assessed by using relative coverage generated by whole genome sequencing.

Specifically:

*Protocol performed at Ann and Robert H. Lurie Children’s Hospital and St. Jude Children’s Research Hospital.

DNA was extracted from normal kidney, tumor, or blood samples at Nationwide Children's BioPathology Center (BPC) by using the standard BPC protocol. PicoGreen analysis was performed to verify the concentration of gDNA. Spectrophotometry was performed to verify DNA purity, and gel electrophoresis was performed to verify DNA quality. Tumor and corresponding normal specimens (blood and/or normal kidney) were supplied to St. Jude Children's Research Hospital on 96-well plates, allowing for the inclusion of two controls.

Nucleic acid labeling, hybridization and array scanning protocols were performed according to the Affymetrix manufacturer's protocol for the Affymetrix 6.0 SNP array at St. Jude Children's Research Hospital.

Data were provided by St. Jude Children's Research Hospital in the Affymetrix CEL file format (Level 1 data). The CEL files were processed using Affymetrix Genotyping Console (GTC) 4.0 software to generate corresponding Birdseed .chp and .txt files (Level 2 data) by using the Birdseed v2 algorithm with the default parameters. Several quality control parameters were used:

  • Contrast QC (quality control): The average contrast QC was 1.83 for all samples, which is above the minimum of 1.7 recommended by Affymetrix. Less than 10% of samples had a contrast QC <0.4, and those samples with contrast QC <0.4 were deemed acceptable based on their heterozygosity values and Birdseed call rates.
  • DNA gender check: Samples were classified into genders using Affymetrix Genotyping Console software; no inconsistencies were noted. Only 0.03% of all samples could not be classified according to gender (“unknown”); all of these samples were tumor samples in which the gender of the corresponding normal sample was called correctly.
  • Sample Call Rate: Affymetrix GTC 4.0 software was used to check the calling rate of constitutional DNA samples and all samples had calling rates greater than the cut-off of >95.5% (range, 94.1–99.5%; mean, 97.9%). Furthermore, the calling rate of tumor samples ranged from 93.4–97.3% (mean, 97.4%).
  • DNA Autosomal Heterozygosity rate: The percentage of heterozygous SNPs among all measured SNPs was determined per sample using Affymetrix GTC 4.0 software. The heterozygosity rates of normal samples ranged from 24–32%, which is within normal limits. This rate, which is expected to be lower for tumors compared to normals, ranged from 15–32% in our tumor samples.
  • Normalization: The reference normalization procedure used for our data relies on an algorithm developed at St. Jude that uses a diploid chromosome for each sample to guide data normalization, as described [1]. In the first step, the CEL files (Level 1 data) and Birdseed .txt files (Level 2 data) are read into dChip, and model-based expression index (MBEI) analysis is performed to generate probe-level summarization values for each individual probe. This results in a file containing two columns for each individual sample: (1) the summarized probe value and (2) the genotype call. This file containing un-normalized data is exported from dChip as a text file and imported into R for reference normalization according to Pounds et al. [1]. This algorithm requires two input files: (1) the dChip output file described above and (2) a text file defining each SNP on the Affymetrix 6.0 chip according to chromosome and location. The reference chromosome for each sample was selected by using Nexus 6.0 software. The reference normalization algorithm provides an output text file containing two columns for each sample: (1) the normalized probe value and (2) the genotype call.
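Two of the QC metrics in the list above, sample call rate and heterozygosity rate, are simple proportions and can be sketched in Python. This is only an illustration of the definitions, not how Genotyping Console computes them: the genotype codes (`"AA"`, `"AB"`, `"BB"`, `"NoCall"`) and function names are assumptions, and the heterozygosity denominator follows the wording "among all measured SNPs".

```python
def call_rate(genotypes: list[str]) -> float:
    """Percentage of SNPs that received a genotype call."""
    called = sum(1 for g in genotypes if g != "NoCall")
    return 100.0 * called / len(genotypes)


def heterozygosity_rate(genotypes: list[str]) -> float:
    """Percentage of heterozygous (AB) SNPs among all measured SNPs."""
    het = sum(1 for g in genotypes if g == "AB")
    return 100.0 * het / len(genotypes)


# Made-up genotype calls for one sample (assumed encoding).
sample = ["AA", "AB", "BB", "AB", "NoCall", "AA", "BB", "AA", "AB", "AA"]
sample_call_rate = call_rate(sample)          # 9 of 10 SNPs called
sample_het_rate = heterozygosity_rate(sample)  # 3 of 10 SNPs heterozygous
```

In practice these values would be computed over the autosomal SNPs of each array and compared against the cut-offs quoted in the list.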

Circular binary segmentation (CBS) was then applied to the output files in order to obtain segmented copy number information. This was performed in R using the DNAcopy Bioconductor package. First, the log (base 2) of the ratio of each tumor sample's signal values to the signal values of the corresponding normal sample was calculated. After detecting outliers and smoothing the log ratio signal data, CBS was applied to segment the data into regions of estimated equal copy number. CBS was performed using parameters including “nperm=10,000”, “alpha=0.01”, “undo.splits=sdundo”, and “undo.SD=1”. This algorithm resulted in a segmented file for each tumor sample relative to the corresponding normal sample.
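The log-ratio step and the gain/loss call can be sketched in Python. This sketch does not perform CBS itself (the protocol used the DNAcopy package in R for that); it only shows the tumor-over-normal log2 ratio and the classification of an already segmented region, using the cut-offs stated for CCSK earlier in this section (log2 value > +0.5 or < -0.5, at least 8 markers). Function names and the example signal values are hypothetical.

```python
import math


def log2_ratios(tumor: list[float], normal: list[float]) -> list[float]:
    """Per-marker log2 of tumor signal over matched normal signal."""
    return [math.log2(t / n) for t, n in zip(tumor, normal)]


def classify_segment(mean_log2: float, n_markers: int,
                     min_markers: int = 8, cutoff: float = 0.5) -> str:
    """Label a segment as gain, loss, or neutral from its mean log2 ratio."""
    if n_markers < min_markers:
        return "neutral"  # too few markers to call under the stated criteria
    if mean_log2 > cutoff:
        return "gain"
    if mean_log2 < -cutoff:
        return "loss"
    return "neutral"


# Made-up normalized signal values for three markers.
ratios = log2_ratios([200.0, 400.0, 100.0], [200.0, 200.0, 200.0])
labels = [
    classify_segment(0.8, 25),   # above +0.5 with enough markers
    classify_segment(-0.9, 12),  # below -0.5 with enough markers
    classify_segment(0.8, 5),    # above +0.5 but fewer than 8 markers
]
```

A doubled tumor signal gives a log2 ratio of 1, a halved signal gives -1, and an unchanged signal gives 0.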

References

  1. Pounds S. et al. (2009). Reference alignment of SNP microarray signals for copy number analysis of tumors. Bioinformatics 25, 315-21 (PMID: 19052058)

Genome-Wide Human SNP Array 6.0 (Affymetrix) for Pediatric Preclinical Testing Program (PPTP)

*Protocol performed at Nationwide Children’s Hospital.

The DNA extraction was performed according to the Qiagen manufacturer's protocol (DNeasy Kit) in combination with TRIzol.

All genotyping for the Genome-Wide Human SNP Array 6.0 was performed according to the Affymetrix manufacturer’s protocol. Briefly, two identical aliquots containing 250 ng of DNA were digested with specific restriction enzymes in separate reactions; one reaction contained NspI and the other StyI. Immediately following digestion, each sample was ligated with adaptors containing a sequence complementary to the overhang generated by digestion. Following ligation, each sample was subjected to PCR amplification using standard reagents. Following PCR, each sample was assayed on a 2% agarose gel to ensure that a DNA smear of appropriate size was produced. The Nsp and Sty amplifications were combined, purified and quantitated. All samples with at least 180 µg total DNA were allowed to continue to fragmentation by enzymatic reaction with the Affymetrix Fragmentation Reagent. The fragmented DNA was assayed on a 4% agarose gel to ensure that the size of the DNA collapsed to less than 75 nt. Following fragmentation, the DNA was end-labeled with terminal deoxynucleotidyl transferase and Affymetrix DNA Labeling Reagent.

Nucleic acid hybridization for the Genome-Wide Human SNP Array 6.0 was performed according to the manufacturer's protocol for the Affymetrix 6.0 SNP array. Each sample was resuspended in hybridization buffer and hybridized to the Affymetrix 6.0 array for 16 hours. Following hybridization, the arrays were washed on the Affymetrix Fluidics Station and scanned on the GeneChip scanner.

The array scanning protocol for the Genome-Wide Human SNP Array 6.0 was performed according to the manufacturer's protocol for the Affymetrix 6.0 SNP array.

Genome-Wide Human SNP Array 6.0 (Affymetrix) for Wilms Tumor (WT)

*Protocol performed at Ann and Robert H. Lurie Children’s Hospital and St. Jude Children’s Research Hospital.

DNA was extracted from normal kidney, tumor, or blood samples at Nationwide Children's BioPathology Center (BPC) by using the standard BPC protocol. PicoGreen analysis was performed to verify the concentration of gDNA. Spectrophotometry was performed to verify DNA purity, and gel electrophoresis was performed to verify DNA quality. Tumor and corresponding normal specimens (blood and/or normal kidney) were supplied to St. Jude Children's Research Hospital on 96-well plates, allowing for the inclusion of two controls.

Nucleic acid labeling, hybridization and array scanning protocols were performed according to the Affymetrix manufacturer's protocol for the Affymetrix 6.0 SNP array at St. Jude Children's Research Hospital.

Data were provided by St. Jude Children's Research Hospital in the Affymetrix CEL file format (Level 1 data). The CEL files were processed using Affymetrix Genotyping Console (GTC) 4.0 software to generate corresponding Birdseed .chp and .txt files (Level 2 data) by using the Birdseed v2 algorithm with the default parameters. Several quality control (QC) parameters were used:

  • Contrast QC: The average contrast QC was 1.83 for all samples, which is above the minimum of 1.7 recommended by Affymetrix. Less than 10% of samples had a contrast QC <0.4, and those samples with contrast QC <0.4 were deemed acceptable based on their heterozygosity values and Birdseed call rates.
  • DNA gender check: Samples were classified into genders using Affymetrix Genotyping Console software; no inconsistencies were noted. Only 0.03% of all samples could not be classified according to gender (“unknown”); all of these samples were tumor samples in which the gender of the corresponding normal sample was called correctly.
  • Sample Call Rate: Affymetrix GTC 4.0 software was used to check the calling rate of constitutional DNA samples and all samples had calling rates greater than the cut-off of >95.5% (range, 94.1–99.5%; mean, 97.9%). Furthermore, the calling rate of tumor samples ranged from 93.4–97.3% (mean, 97.4%).
  • DNA Autosomal Heterozygosity rate: The percentage of heterozygous SNPs among all measured SNPs was determined per sample using Affymetrix GTC 4.0 software. The heterozygosity rates of normal samples ranged from 24–32%, which is within normal limits. This rate, which is expected to be lower for tumors compared to normals, ranged from 15–32% in our tumor samples.
  • Normalization: The reference normalization procedure used for our data relies on an algorithm developed at St. Jude that uses a diploid chromosome for each sample to guide data normalization, as described [1]. In the first step, the CEL files (Level 1 data) and Birdseed .txt files (Level 2 data) are read into dChip, and model-based expression index (MBEI) analysis is performed to generate probe-level summarization values for each individual probe. This results in a file containing two columns for each individual sample: (1) the summarized probe value and (2) the genotype call. This file containing un-normalized data is exported from dChip as a text file and imported into R for reference normalization according to Pounds et al. [1]. This algorithm requires two input files: (1) the dChip output file described above and (2) a text file defining each SNP on the Affymetrix 6.0 chip according to chromosome and location. The reference chromosome for each sample was selected by using Nexus 6.0 software. The reference normalization algorithm provides an output text file containing two columns for each sample: (1) the normalized probe value and (2) the genotype call.

Circular binary segmentation (CBS) was then applied to the output files in order to obtain segmented copy number information. This was performed in R using the DNAcopy Bioconductor package. First, the log (base 2) of the ratio of each tumor sample's signal values to the signal values of the corresponding normal sample was calculated. After detecting outliers and smoothing the log ratio signal data, CBS was applied to segment the data into regions of estimated equal copy number. CBS was performed using parameters including “nperm=10,000”, “alpha=0.01”, “undo.splits=sdundo”, and “undo.SD=1”. This algorithm resulted in a segmented file for each tumor sample relative to the corresponding normal sample.

References

  1. Pounds S. et al. (2009). Reference alignment of SNP microarray signals for copy number analysis of tumors. Bioinformatics 25, 315-21 (PMID: 19052058)

Genome-Wide Human SNP Array 6.0 (Affymetrix) for Osteosarcoma (OS)

*Protocol performed at Nationwide Children’s Hospital (nucleic acid extraction), Children’s Hospital of Los Angeles (data generation) and Texas Children’s Hospital (data analysis/summarization).

DNA was extracted using the DNA/RNA co-isolation method using the Qiagen AllPrep Kit.

DNA samples were labeled using the SNP 6.0 core reagent kit following Affymetrix guidelines. Hybridization of samples was performed following the manufacturer's instructions for the Affymetrix Genome-Wide Human SNP Array 6.0, and arrays were scanned using the Affymetrix GeneChip Scanner 3000 7G.

Raw CEL files were processed using Genotyping Console Software, and segmentation analysis was performed using the Partek Genomics Suite genomic segmentation algorithm.

HumanHap 550K Beadchip (Illumina) for Neuroblastoma (NBL)

*Protocol performed at Nationwide Children’s Hospital (extractions) and Children’s Hospital of Philadelphia.

The DNA was extracted at the Nationwide Children's Molecular Genetics Laboratory (MGL) using either the Qiagen AllPrep co-isolation method or the Qiagen Genomic-tips protocol. QIAGEN Blood & Cell Culture DNA Kits and QIAGEN Genomic-tips with the Genomic DNA Buffer Set isolate pure, high-molecular-weight genomic DNA directly from whole blood, lymphocytes and tissues. The procedure is based on an optimized buffer system for lysis of cells and/or nuclei, followed by binding of genomic DNA to QIAGEN Anion-Exchange Resin under appropriate low-salt and pH conditions. RNA, proteins, dyes and low-molecular-weight impurities are removed by a medium-salt wash. Genomic DNA is eluted in a high-salt buffer and then concentrated and desalted by isopropanol precipitation.

Nucleic acid labeling, hybridization, array scanning and data normalization protocols were performed according to the Illumina manufacturer's protocol for the Illumina 550K array at the Children's Hospital of Philadelphia.

Data transformation was done using the OverUnder algorithm [1]. Older Level 2 data that used reference genome hg18 were remapped to hg19 during analysis, so all Level 3 copy number segmentation results use hg19.

References

  1. Attiyeh EF et al. (2009). Genomic copy number determination in cancer cells from single nucleotide polymorphism microarrays based on quantitative genotyping corrected for aneuploidy. Genome Res 19(2), 276-83 (PMID: 19141597)

Human Omni5 BeadChip Kit (Illumina) for Pediatric Preclinical Testing Program (PPTP)

*Protocol performed at Nationwide Children’s Hospital.

Nucleic acid labeling for the Human Omni5 BeadChip was performed according to Illumina’s standard protocol; please refer to the Illumina Infinium LCG Quad Assay protocol manual.

Hybridization of the Human Omni5 BeadChip was performed according to the manufacturer's standard protocol; please refer to the Illumina Infinium LCG Quad Assay protocol manual.

Scanning of the Human Omni5 BeadChip was performed according to the manufacturer's standard protocol using the Illumina HiScan instrument with iScan software.

Last updated: November 25, 2016