Whole Exome Sequencing

Sequencing Center | Data Generation Protocols | Data Analysis Protocols
Baylor College of Medicine | ALL P2, AML, NBL MDLS, PPTP, WT | ALL P2, AML, NBL MDLS, PPTP, WT
Broad Institute | NBL | NBL
St. Jude Children’s Research Hospital (SJCRH) | ALAL | ALAL

*Protocols were performed at Baylor College of Medicine.

Library construction

Specimen processing, DNA extraction and standard QC were performed, and Illumina paired-end pre-capture libraries were prepared, according to the manufacturer's protocol (Illumina Inc., San Diego, CA) with the following modifications: 0.5-1 µg of genomic DNA in a 100 µL volume was sheared into fragments of approximately 300 base pairs in a Covaris E210 system (Covaris, Inc., Woburn, MA). The settings were 10% duty cycle, intensity of 4 and 200 cycles per burst for 120 seconds. Fragment size was checked using a 2.2% Flash Gel DNA Cassette (Lonza, Walkersville, MD, Cat. No. 57023). End-repair of fragmented DNA was performed in a 90 µL total reaction volume containing sheared DNA, 9 µL 10X buffer, 5 µL End Repair Enzyme Mix and H2O (NEBNext End-Repair Module, New England BioLabs, Ipswich, MA, Cat. No. E6050L), incubated at 20°C for 30 minutes. A-tailing was performed in a total reaction volume of 60 µL containing end-repaired DNA, 6 µL 10X buffer, 3 µL Klenow fragment (NEBNext dA-Tailing Module; Cat. No. E6053L) and H2O, followed by incubation at 37°C for 30 minutes. Illumina multiplex adapter ligation (NEBNext Quick Ligation Module, Cat. No. E6056L) was performed in a total reaction volume of 90 µL containing 18 µL 5X buffer, 5 µL ligase, 0.5 µL 100 µM adaptor and H2O at room temperature for 30 minutes. After ligation, PCR with Illumina PE 1.0 and modified barcode primers (manuscript in preparation) was performed in 170 µL reactions containing 85 µL of 2X Phusion High-Fidelity PCR master mix, adaptor-ligated DNA, 1.75 µL of 50 µM primers and H2O. PCR was performed using a 5-minute initial denaturation at 95°C, 6-10 cycles of 15 seconds at 95°C, 15 seconds at 60°C and 30 seconds at 72°C, followed by a final extension for 5 minutes at 72°C. Agencourt XP Beads (Beckman Coulter Genomics, Inc., Danvers, MA, Cat. No. A63882) were used to purify DNA after each enzymatic reaction. After purification, PCR product quantity and size distribution were determined using the Caliper GX 1K/12K/High Sensitivity Assay Labchip (Hopkinton, MA, Cat. No. 760517).
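The reaction volumes above can be sanity-checked with the dilution rule C1 x V1 = C2 x V2. The following is a minimal sketch in Python, not part of the original protocol; the function and variable names are hypothetical, and only the volumes and stock concentrations are taken from the text.

```python
def final_concentration(stock_conc, stock_volume_ul, total_volume_ul):
    """Dilution rule C1*V1 = C2*V2, solved for the final concentration C2."""
    return stock_conc * stock_volume_ul / total_volume_ul


# Final buffer strengths (in "X") for each enzymatic step in the text.
end_repair_buffer_x = final_concentration(10, 9, 90)   # 9 uL of 10X in 90 uL
a_tailing_buffer_x = final_concentration(10, 6, 60)    # 6 uL of 10X in 60 uL
ligation_buffer_x = final_concentration(5, 18, 90)     # 18 uL of 5X in 90 uL
pcr_master_mix_x = final_concentration(2, 85, 170)     # 85 uL of 2X in 170 uL

# Final oligo concentrations (in uM).
adaptor_um = final_concentration(100, 0.5, 90)         # 0.5 uL of 100 uM in 90 uL
primer_um = final_concentration(50, 1.75, 170)         # 1.75 uL of 50 uM in 170 uL

if __name__ == "__main__":
    print(f"buffers: {end_repair_buffer_x}X, {a_tailing_buffer_x}X, "
          f"{ligation_buffer_x}X, {pcr_master_mix_x}X")
    print(f"adaptor: {adaptor_um:.2f} uM, primers: {primer_um:.2f} uM")
```

Under these figures every buffer and the PCR master mix work out to 1X in the final reaction, and the adaptor and primers both end up at roughly 0.5 µM.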

Exome capture

Illumina pre-capture libraries (1 µg DNA input) were hybridized in solution to SeqCap EZ Human Exome 2.0 (Nimblegen, Madison, WI) probes targeting approximately 44 Mb of sequence from approximately 30K genes, according to the manufacturer's protocol with the following modifications: hybridization-enhancing oligos IHE1, IHE2 and IHE3 replaced oligos HE1.1 and HE2.1, and post-capture LM-PCR was performed using 14 cycles. Capture libraries were quantified using the Caliper GX 1K/12K/High Sensitivity Assay Labchip (Hopkinton, MA, Cat. No. 760517). The efficiency of the capture was evaluated by performing a qPCR-based quality check on the built-in controls (qPCR SYBR Green assays, Applied Biosystems, Grand Island, NY). Four standardized oligo sets, RUNX2, PRKG1, SMG1 and NLK, were employed as internal quality controls. The enrichment of the capture libraries was estimated to range from 7- to 9-fold over background.

Library templates were prepared for sequencing using Illumina's cBot cluster generation system with TruSeq PE Cluster Generation Kits (Part No. PE-401-3001). Briefly, the libraries were denatured with sodium hydroxide and diluted to 6-9 pM in hybridization buffer to achieve a load density of ~800K clusters/mm². Each library pool was loaded in a single lane of a HiSeq flow cell, and each lane was spiked with 2% phiX control library for run quality control. The sample libraries then underwent bridge amplification to form clonal clusters, followed by hybridization with the sequencing primer. Sequencing runs were performed in paired-end mode on the Illumina HiSeq 2000 platform. Using TruSeq SBS Kits (Part No. FC-401-3001), sequencing-by-synthesis reactions were extended for 101 cycles from each end, with an additional 7 cycles for the index read. Sequencing runs generated approximately 300-400 million successful reads on each lane of a flow cell, with approximately 9-10 Gb produced per sample. With these sequencing yields, samples achieved an average of 95% of the targeted exome bases covered to a depth of 20X or greater.
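The yield figures in this paragraph can be related to one another with simple arithmetic. The sketch below is illustrative only: it assumes each "successful read" is a single 101 bp read, and it treats the per-sample depth as an upper bound, because off-target reads, duplicates and unmapped reads are not accounted for. The function names are hypothetical.

```python
READ_LENGTH_BP = 101         # cycles per end, from the text
TARGET_SIZE_BP = 44_000_000  # approximate capture target (44 Mb), from the text


def lane_yield_gb(reads, read_length_bp=READ_LENGTH_BP):
    """Raw bases per lane in Gb, assuming every read has the full length."""
    return reads * read_length_bp / 1e9


def max_mean_depth(sample_yield_gb, target_size_bp=TARGET_SIZE_BP):
    """Upper bound on mean target depth if every base fell on target."""
    return sample_yield_gb * 1e9 / target_size_bp


if __name__ == "__main__":
    for reads in (300e6, 400e6):
        print(f"{reads:.0e} reads per lane -> {lane_yield_gb(reads):.1f} Gb")
    for gb in (9, 10):
        print(f"{gb} Gb per sample -> at most {max_mean_depth(gb):.0f}X mean depth")
```

Under these assumptions a lane yields roughly 30-40 Gb, and 9-10 Gb per sample corresponds to at most about 200-230X mean depth over the target before any losses, which is compatible with 95% of target bases reaching 20X.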

Real Time Analysis (RTA) software was used to perform image analysis and nucleotide base calling. On average, about 80-100 million successful reads, consisting of 2 x 100 bp, were generated on each lane of a flow cell.

Mapping Reads

Illumina HiSeq bcl files were processed using BCLConvertor v1.7.1. All reads from the prepared libraries that passed the Illumina Chastity filter were formatted into fastq files. The fastq files were aligned to human reference genome build 37 (NCBI) using BWA (bwa-0.5.9-R16) with default parameters, with the following exceptions: seed sequence: 40 bp, seed mismatch: 2, total mismatches allowed: 3. BAM files generated from alignment were preprocessed using GATK (v1.3-8-gb0e6afe) [1] to recalibrate and locally realign reads.
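As an illustration of the non-default alignment settings, the sketch below builds a `bwa aln` argument list in Python. The mapping of the stated parameters to the `-l` (seed length), `-k` (maximum differences in the seed) and `-n` (maximum total differences) options is an assumption based on the `bwa aln` option descriptions, not something the protocol states, and the file names are hypothetical placeholders.

```python
def bwa_aln_command(reference_fasta, fastq,
                    seed_length=40, seed_mismatches=2, total_mismatches=3):
    """Build an argument list for `bwa aln` with the stated exceptions.

    Assumed option mapping: -l seed length, -k differences allowed in the
    seed, -n total differences allowed. All other options stay at default.
    """
    return [
        "bwa", "aln",
        "-l", str(seed_length),
        "-k", str(seed_mismatches),
        "-n", str(total_mismatches),
        reference_fasta,
        fastq,
    ]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    cmd = bwa_aln_command("build37.fa", "sample_R1.fastq")
    print(" ".join(cmd))
    # To run it, the output (written to stdout by bwa aln) would be
    # redirected to a .sai file, for example with subprocess.run(cmd, stdout=f).
```

With paired-end data, one such command would be run per fastq file, and the resulting alignments combined in a separate pairing step.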

Mutation Detection

Sequence variants were called from tumor and matched normal BAM files using Atlas [2], an integrative suite of variant analysis tools specializing in the separation of true SNPs and insertions and deletions (indels) from sequencing and mapping errors in whole exome capture sequencing (WXS) data. The suite implements logistic regression models trained on validated WXS data to identify the true variants. ATLAS-SNP-2 (v1.3) [3] and ATLAS-Indel-2 (v0.3.1), along with Pindel (v0.2.4q) [4], were run on the BAM files, producing variant data that were further filtered to remove all variants that were observed fewer than 5 times or were present in less than 0.08 of the reads (i.e., the variant allele fraction must be greater than 0.08 to undergo validation). At least one variant read of Q30 or better was required, and the variant had to lie in the central portion of the read (15% from the 5' end of the read and 20% from the 3' end). In addition, reads harboring the variant must have been observed in both forward and reverse orientations. Finally, the variant base must not have been observed in the normal tissue. Indels were discovered by similar processing, except that indels must have been observed in at least 10 of the reads.
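The SNV filtering rules above can be sketched as a single predicate. This is an illustrative Python sketch, not the Atlas implementation: the data structure and names are hypothetical, the allele-fraction boundary is treated as inclusive (at least 0.08), and the quality and read-position rules are assumed to apply to the same supporting read. The separate indel threshold is not modeled.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VariantRead:
    base_quality: int         # Phred quality of the variant base
    relative_position: float  # 0.0 = 5' end of the read, 1.0 = 3' end
    is_reverse: bool          # True if the read maps to the reverse strand


def passes_snv_filters(variant_reads: List[VariantRead], tumor_depth: int,
                       normal_variant_reads: int, min_variant_reads: int = 5,
                       min_vaf: float = 0.08, min_quality: int = 30,
                       five_prime_margin: float = 0.15,
                       three_prime_margin: float = 0.20) -> bool:
    """Return True if a candidate SNV survives all the filters in the text."""
    if tumor_depth <= 0:
        return False
    enough_support = len(variant_reads) >= min_variant_reads
    enough_fraction = len(variant_reads) / tumor_depth >= min_vaf
    # Assumption: one read must satisfy both the Q30 and the position rule.
    good_central_read = any(
        r.base_quality >= min_quality
        and five_prime_margin <= r.relative_position <= 1.0 - three_prime_margin
        for r in variant_reads
    )
    both_strands = (any(r.is_reverse for r in variant_reads)
                    and any(not r.is_reverse for r in variant_reads))
    absent_in_normal = normal_variant_reads == 0
    return (enough_support and enough_fraction and good_central_read
            and both_strands and absent_in_normal)
```

For example, a variant supported by six tumor reads out of 50, seen on both strands with one Q35 read in the middle of the read and no support in the normal sample, would pass; the same variant with a single supporting read in the normal sample would be rejected.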

 

Whole Exome Sequencing

*Protocols were performed at Baylor College of Medicine.

Library construction

Specimen processing, DNA extraction, standard QC and Illumina paired-end pre-capture libraries were prepared according to the manufacturer's protocol (Illumina Inc, San Diego, CA) with the following modifications: 0.5 - 1ug genomic DNA in 100ul volume was sheared into fragments of approximately 300 base pairs in a Covaris E210 system (Covaris, Inc. Woburn, MA). The setting was 10% duty cycle, intensity of 4,200 cycles per burst for 120 seconds. Fragment size was checked using a 2.2% Flash Gel DNA Cassette (Lonza, Walkersville, MD, Cat. No.57023). End-repair of fragmented DNA was performed in 90ul total reaction volume containing sheared DNA, 9 ul 10X buffer, 5 ul END Repair Enzyme Mix and H2O (NEBNext End-Repair Module, New England BioLabs, Ipswich, MA, Cat. No. E6050L), incubated at 20°C for 30 minutes. A-tailing was performed in a total reaction volume of 60ul containing end-repaired DNA, 6ul 10X buffer, 3ul Klenow fragment (NEBNext dA-Tailing Module; Cat. No. E6053L) and H2O followed by incubation at 37°C for 30 minutes. Illumina multiplex adapter ligation (NEBNext Quick Ligation Module Cat. No. E6056L) was performed in a total reaction volume of 90ul containing 18ul 5X buffer, 5ul ligase, 0.5ul 100uM adaptor and H2O at room temperature for 30 minutes. After ligation, PCR with Illumina PE 1.0 and modified barcode primers (manuscript in preparation) was performed in 170ul reactions containing 85ul of 2x Phusion High-Fidelity PCR master mix, adaptor ligated DNA, 1.75ul of 50uM primers and H2O. PCR was performed using a 5-minute initial denaturation at 95°C, 6-10 cycles of 15 seconds at 95°C, 15 seconds at 60°C and 30 seconds at 72°C followed by a final extension for 5 minute at 72°C. Agencourt XP Beads (Beckman Coulter Genomics, Inc., Danvers, MA, Cat. No. A63882) were used to purify DNA after each enzymatic reaction. 
After purification, PCR product quantification and size distribution was determined using the Caliper GX 1K/12K/High Sensitivity Assay Labchip (Hopkinton, MA, Cat. No. 760517).

Exome capture

Illumina pre-capture libraries (1ug DNA input) were hybridized in solution to SeqCap EZ Human Exome 2.0 (Nimblegen, Madison, WI) probes targeting approximately 44Mbs of sequence from approximately 30K genes according to the manufacturer's protocol with the following modifications: hybridization enhancing oligos IHE1, IHE2 and IHE3 replaced oligos HE1.1 and HE2.1 and post-capture LM-PCR was performed using 14 cycles. Capture libraries were quantified using Caliper GX 1K/12K/High Sensitivity Assay Labchip (Hopkinton, MA, Cat. No. 760517). The efficiency of the capture was evaluated by performing a qPCR-based quality check on the built-in controls (qPCR SYBR Green assays, Applied Biosystems, Grand Island, NY). Four standardized oligo sets, RUNX2, PRKG1, SMG1, and NLK, were employed as internal quality controls. The enrichment of the capture libraries was estimated to range from 7- to 9-fold over background.

Library templates were prepared for sequencing using Illumina's cBot cluster generation system with TruSeq PE Cluster Generation Kits (Part no. PE-401-3001). Briefly, these libraries were denatured with sodium hydroxide and diluted to 6-9 pM in hybridization buffer in order to achieve a load density of ~800K clusters/mm2. Each library pool was loaded in a single lane of a HiSeq flow cell, and each lane was spiked with 2% phiX control library for run quality control. The sample libraries then underwent bridge amplification to form clonal clusters, followed by hybridization with the sequencing primer. Sequencing runs were performed in paired-end mode using the Illumina HiSeq 2000 platform. Using the TruSeq SBS Kits (Part no. FC-401-3001), sequencing-by-synthesis reactions were extended for 101 cycles from each end, with an additional 7 cycles for the index read. Sequencing runs generated approximately 300-400 million successful reads on each lane of a flow cell, with approximately 9-10 Gb produced per sample. With these sequencing yields, samples achieved an average of 95% of the targeted exome bases covered to a depth of 20X or greater.

Real Time Analysis (RTA) software was used to process the image analysis and nucleotide base calling. On average, about 80-100 million successful reads, consisting of 2X 100 bp, were generated on each lane of a flow cell.

Whole Exome Sequencing

*Protocols were performed at Baylor College of Medicine.

Mapping Reads

Illumina HiSeq bcl files were processed using BCLConvertor v1.7.1. All reads from the prepared libraries that passed the Illumina Chastity filter were formatted into fastq files. The fastq files were aligned to human reference genome build37 (NCBI) using BWA (bwa-0.5.9-R16) with default parameters with the following exceptions: seed sequence: 40 bpseed mismatch: 2, total mismatches allowed: 3. BAM files generated from alignment were preprocessed using GATK (v1.3-8-gb0e6afe) [1] to recalibrate and locally realign reads.

Mutation Detection

Sequence variants were called from tumor and matched normal BAM files using Atlas [2] an integrative variant analysis suite of tools specializing in the separation of true SNPs and insertions and deletions (indels) from sequencing and mapping errors in whole exome capture sequencing (WXS) data. The suite implements logistic regression models trained on validated WXS data to identify the true variants. ATLAS-SNP-2 (v1.3) [3] and ATLAS-Indel-2 (v0.3.1) along with Pindel (v0.2.4q) [4] were run on the BAM files producing variant data that were further filtered to remove all those observed fewer than 5 times or were present in less than 0.08 of the reads (e.g., variant allele fraction must be greater than 0.08 to undergo validation). At least one variant read of Q30 or better was required, and the variant had to lie in the central portion of the read (15% from the 5' end of the read and 20% from the 3' end). In addition, reads harboring the variant must have been observed in both forward and reverse orientations. Finally, the variant base was not observed in the normal tissue. Indels were discovered by similar processing except indels must have been observed in at least 10 of the reads.

Whole Exome Sequencing

*Protocols were performed at Baylor College of Medicine.

Library construction

Specimen processing, DNA extraction, standard QC and Illumina paired-end pre-capture libraries were prepared according to the manufacturer's protocol (Illumina Inc, San Diego, CA) with the following modifications: 0.5 - 1ug genomic DNA in 100ul volume was sheared into fragments of approximately 300 base pairs in a Covaris E210 system (Covaris, Inc. Woburn, MA). The setting was 10% duty cycle, intensity of 4,200 cycles per burst for 120 seconds. Fragment size was checked using a 2.2% Flash Gel DNA Cassette (Lonza, Walkersville, MD, Cat. No.57023). End-repair of fragmented DNA was performed in 90ul total reaction volume containing sheared DNA, 9 ul 10X buffer, 5 ul END Repair Enzyme Mix and H2O (NEBNext End-Repair Module, New England BioLabs, Ipswich, MA, Cat. No. E6050L), incubated at 20°C for 30 minutes. A-tailing was performed in a total reaction volume of 60ul containing end-repaired DNA, 6ul 10X buffer, 3ul Klenow fragment (NEBNext dA-Tailing Module; Cat. No. E6053L) and H2O followed by incubation at 37°C for 30 minutes. Illumina multiplex adapter ligation (NEBNext Quick Ligation Module Cat. No. E6056L) was performed in a total reaction volume of 90ul containing 18ul 5X buffer, 5ul ligase, 0.5ul 100uM adaptor and H2O at room temperature for 30 minutes. After ligation, PCR with Illumina PE 1.0 and modified barcode primers (manuscript in preparation) was performed in 170ul reactions containing 85ul of 2x Phusion High-Fidelity PCR master mix, adaptor ligated DNA, 1.75ul of 50uM primers and H2O. PCR was performed using a 5-minute initial denaturation at 95°C, 6-10 cycles of 15 seconds at 95°C, 15 seconds at 60°C and 30 seconds at 72°C followed by a final extension for 5 minute at 72°C. Agencourt XP Beads (Beckman Coulter Genomics, Inc., Danvers, MA, Cat. No. A63882) were used to purify DNA after each enzymatic reaction. 
After purification, PCR product quantification and size distribution was determined using the Caliper GX 1K/12K/High Sensitivity Assay Labchip (Hopkinton, MA, Cat. No. 760517).

Exome capture

Illumina pre-capture libraries (1ug DNA input) were hybridized in solution to SeqCap EZ Human Exome 2.0 (Nimblegen, Madison, WI) probes targeting approximately 44Mbs of sequence from approximately 30K genes according to the manufacturer's protocol with the following modifications: hybridization enhancing oligos IHE1, IHE2 and IHE3 replaced oligos HE1.1 and HE2.1 and post-capture LM-PCR was performed using 14 cycles. Capture libraries were quantified using Caliper GX 1K/12K/High Sensitivity Assay Labchip (Hopkinton, MA, Cat. No. 760517). The efficiency of the capture was evaluated by performing a qPCR-based quality check on the built-in controls (qPCR SYBR Green assays, Applied Biosystems, Grand Island, NY). Four standardized oligo sets, RUNX2, PRKG1, SMG1, and NLK, were employed as internal quality controls. The enrichment of the capture libraries was estimated to range from 7- to 9-fold over background.

Library templates were prepared for sequencing using Illumina's cBot cluster generation system with TruSeq PE Cluster Generation Kits (Part no. PE-401-3001). Briefly, these libraries were denatured with sodium hydroxide and diluted to 6-9 pM in hybridization buffer in order to achieve a load density of ~800K clusters/mm2. Each library pool was loaded in a single lane of a HiSeq flow cell, and each lane was spiked with 2% phiX control library for run quality control. The sample libraries then underwent bridge amplification to form clonal clusters, followed by hybridization with the sequencing primer. Sequencing runs were performed in paired-end mode using the Illumina HiSeq 2000 platform. Using the TruSeq SBS Kits (Part no. FC-401-3001), sequencing-by-synthesis reactions were extended for 101 cycles from each end, with an additional 7 cycles for the index read. Sequencing runs generated approximately 300-400 million successful reads on each lane of a flow cell, with approximately 9-10 Gb produced per sample. With these sequencing yields, samples achieved an average of 95% of the targeted exome bases covered to a depth of 20X or greater.

Real Time Analysis (RTA) software was used to process the image analysis and nucleotide base calling. On average, about 80-100 million successful reads, consisting of 2X 100 bp, were generated on each lane of a flow cell.

Whole Exome Sequencing

*Protocols were performed at Baylor College of Medicine.

Mapping Reads

Illumina HiSeq bcl files were processed using BCLConvertor v1.7.1. All reads from the prepared libraries that passed the Illumina Chastity filter were formatted into fastq files. The fastq files were aligned to human reference genome build37 (NCBI) using BWA (bwa-0.5.9-R16) with default parameters with the following exceptions: seed sequence: 40 bpseed mismatch: 2, total mismatches allowed: 3. BAM files generated from alignment were preprocessed using GATK (v1.3-8-gb0e6afe) [1] to recalibrate and locally realign reads.

Mutation Detection

Sequence variants were called from tumor and matched normal BAM files using Atlas [2] an integrative variant analysis suite of tools specializing in the separation of true SNPs and insertions and deletions (indels) from sequencing and mapping errors in whole exome capture sequencing (WXS) data. The suite implements logistic regression models trained on validated WXS data to identify the true variants. ATLAS-SNP-2 (v1.3) [3] and ATLAS-Indel-2 (v0.3.1) along with Pindel (v0.2.4q) [4] were run on the BAM files producing variant data that were further filtered to remove all those observed fewer than 5 times or were present in less than 0.08 of the reads (e.g., variant allele fraction must be greater than 0.08 to undergo validation). At least one variant read of Q30 or better was required, and the variant had to lie in the central portion of the read (15% from the 5' end of the read and 20% from the 3' end). In addition, reads harboring the variant must have been observed in both forward and reverse orientations. Finally, the variant base was not observed in the normal tissue. Indels were discovered by similar processing except indels must have been observed in at least 10 of the reads.

Whole Exome Sequencing

*Protocols were performed at Baylor College of Medicine.

Library construction

Specimen processing, DNA extraction, standard QC and Illumina paired-end pre-capture libraries were prepared according to the manufacturer's protocol (Illumina Inc, San Diego, CA) with the following modifications: 0.5 - 1ug genomic DNA in 100ul volume was sheared into fragments of approximately 300 base pairs in a Covaris E210 system (Covaris, Inc. Woburn, MA). The setting was 10% duty cycle, intensity of 4,200 cycles per burst for 120 seconds. Fragment size was checked using a 2.2% Flash Gel DNA Cassette (Lonza, Walkersville, MD, Cat. No.57023). End-repair of fragmented DNA was performed in 90ul total reaction volume containing sheared DNA, 9 ul 10X buffer, 5 ul END Repair Enzyme Mix and H2O (NEBNext End-Repair Module, New England BioLabs, Ipswich, MA, Cat. No. E6050L), incubated at 20°C for 30 minutes. A-tailing was performed in a total reaction volume of 60ul containing end-repaired DNA, 6ul 10X buffer, 3ul Klenow fragment (NEBNext dA-Tailing Module; Cat. No. E6053L) and H2O followed by incubation at 37°C for 30 minutes. Illumina multiplex adapter ligation (NEBNext Quick Ligation Module Cat. No. E6056L) was performed in a total reaction volume of 90ul containing 18ul 5X buffer, 5ul ligase, 0.5ul 100uM adaptor and H2O at room temperature for 30 minutes. After ligation, PCR with Illumina PE 1.0 and modified barcode primers (manuscript in preparation) was performed in 170ul reactions containing 85ul of 2x Phusion High-Fidelity PCR master mix, adaptor ligated DNA, 1.75ul of 50uM primers and H2O. PCR was performed using a 5-minute initial denaturation at 95°C, 6-10 cycles of 15 seconds at 95°C, 15 seconds at 60°C and 30 seconds at 72°C followed by a final extension for 5 minute at 72°C. Agencourt XP Beads (Beckman Coulter Genomics, Inc., Danvers, MA, Cat. No. A63882) were used to purify DNA after each enzymatic reaction. 
After purification, PCR product quantification and size distribution was determined using the Caliper GX 1K/12K/High Sensitivity Assay Labchip (Hopkinton, MA, Cat. No. 760517).

Exome capture

Illumina pre-capture libraries (1ug DNA input) were hybridized in solution to SeqCap EZ Human Exome 2.0 (Nimblegen, Madison, WI) probes targeting approximately 44Mbs of sequence from approximately 30K genes according to the manufacturer's protocol with the following modifications: hybridization enhancing oligos IHE1, IHE2 and IHE3 replaced oligos HE1.1 and HE2.1 and post-capture LM-PCR was performed using 14 cycles. Capture libraries were quantified using Caliper GX 1K/12K/High Sensitivity Assay Labchip (Hopkinton, MA, Cat. No. 760517). The efficiency of the capture was evaluated by performing a qPCR-based quality check on the built-in controls (qPCR SYBR Green assays, Applied Biosystems, Grand Island, NY). Four standardized oligo sets, RUNX2, PRKG1, SMG1, and NLK, were employed as internal quality controls. The enrichment of the capture libraries was estimated to range from 7- to 9-fold over background.

Library templates were prepared for sequencing using Illumina's cBot cluster generation system with TruSeq PE Cluster Generation Kits (Part no. PE-401-3001). Briefly, these libraries were denatured with sodium hydroxide and diluted to 6-9 pM in hybridization buffer in order to achieve a load density of ~800K clusters/mm2. Each library pool was loaded in a single lane of a HiSeq flow cell, and each lane was spiked with 2% phiX control library for run quality control. The sample libraries then underwent bridge amplification to form clonal clusters, followed by hybridization with the sequencing primer. Sequencing runs were performed in paired-end mode using the Illumina HiSeq 2000 platform. Using the TruSeq SBS Kits (Part no. FC-401-3001), sequencing-by-synthesis reactions were extended for 101 cycles from each end, with an additional 7 cycles for the index read. Sequencing runs generated approximately 300-400 million successful reads on each lane of a flow cell, with approximately 9-10 Gb produced per sample. With these sequencing yields, samples achieved an average of 95% of the targeted exome bases covered to a depth of 20X or greater.

Real Time Analysis (RTA) software was used to process the image analysis and nucleotide base calling. On average, about 80-100 million successful reads, consisting of 2X 100 bp, were generated on each lane of a flow cell.

Whole Exome Sequencing

*Protocols were performed at Baylor College of Medicine.

Mapping Reads

Illumina HiSeq bcl files were processed using BCLConvertor v1.7.1. All reads from the prepared libraries that passed the Illumina Chastity filter were formatted into fastq files. The fastq files were aligned to human reference genome build37 (NCBI) using BWA (bwa-0.5.9-R16) with default parameters with the following exceptions: seed sequence: 40 bpseed mismatch: 2, total mismatches allowed: 3. BAM files generated from alignment were preprocessed using GATK (v1.3-8-gb0e6afe) [1] to recalibrate and locally realign reads.

Mutation Detection

Sequence variants were called from tumor and matched normal BAM files using Atlas [2] an integrative variant analysis suite of tools specializing in the separation of true SNPs and insertions and deletions (indels) from sequencing and mapping errors in whole exome capture sequencing (WXS) data. The suite implements logistic regression models trained on validated WXS data to identify the true variants. ATLAS-SNP-2 (v1.3) [3] and ATLAS-Indel-2 (v0.3.1) along with Pindel (v0.2.4q) [4] were run on the BAM files producing variant data that were further filtered to remove all those observed fewer than 5 times or were present in less than 0.08 of the reads (e.g., variant allele fraction must be greater than 0.08 to undergo validation). At least one variant read of Q30 or better was required, and the variant had to lie in the central portion of the read (15% from the 5' end of the read and 20% from the 3' end). In addition, reads harboring the variant must have been observed in both forward and reverse orientations. Finally, the variant base was not observed in the normal tissue. Indels were discovered by similar processing except indels must have been observed in at least 10 of the reads.

Whole Exome Sequencing

*Protocols were performed at Baylor College of Medicine.

Library construction

Specimen processing, DNA extraction, standard QC and Illumina paired-end pre-capture libraries were prepared according to the manufacturer's protocol (Illumina Inc, San Diego, CA) with the following modifications: 0.5 - 1ug genomic DNA in 100ul volume was sheared into fragments of approximately 300 base pairs in a Covaris E210 system (Covaris, Inc. Woburn, MA). The setting was 10% duty cycle, intensity of 4,200 cycles per burst for 120 seconds. Fragment size was checked using a 2.2% Flash Gel DNA Cassette (Lonza, Walkersville, MD, Cat. No.57023). End-repair of fragmented DNA was performed in 90ul total reaction volume containing sheared DNA, 9 ul 10X buffer, 5 ul END Repair Enzyme Mix and H2O (NEBNext End-Repair Module, New England BioLabs, Ipswich, MA, Cat. No. E6050L), incubated at 20°C for 30 minutes. A-tailing was performed in a total reaction volume of 60ul containing end-repaired DNA, 6ul 10X buffer, 3ul Klenow fragment (NEBNext dA-Tailing Module; Cat. No. E6053L) and H2O followed by incubation at 37°C for 30 minutes. Illumina multiplex adapter ligation (NEBNext Quick Ligation Module Cat. No. E6056L) was performed in a total reaction volume of 90ul containing 18ul 5X buffer, 5ul ligase, 0.5ul 100uM adaptor and H2O at room temperature for 30 minutes. After ligation, PCR with Illumina PE 1.0 and modified barcode primers (manuscript in preparation) was performed in 170ul reactions containing 85ul of 2x Phusion High-Fidelity PCR master mix, adaptor ligated DNA, 1.75ul of 50uM primers and H2O. PCR was performed using a 5-minute initial denaturation at 95°C, 6-10 cycles of 15 seconds at 95°C, 15 seconds at 60°C and 30 seconds at 72°C followed by a final extension for 5 minute at 72°C. Agencourt XP Beads (Beckman Coulter Genomics, Inc., Danvers, MA, Cat. No. A63882) were used to purify DNA after each enzymatic reaction. 
After purification, PCR product quantification and size distribution was determined using the Caliper GX 1K/12K/High Sensitivity Assay Labchip (Hopkinton, MA, Cat. No. 760517).

Exome capture

Illumina pre-capture libraries (1ug DNA input) were hybridized in solution to SeqCap EZ Human Exome 2.0 (Nimblegen, Madison, WI) probes targeting approximately 44Mbs of sequence from approximately 30K genes according to the manufacturer's protocol with the following modifications: hybridization enhancing oligos IHE1, IHE2 and IHE3 replaced oligos HE1.1 and HE2.1 and post-capture LM-PCR was performed using 14 cycles. Capture libraries were quantified using Caliper GX 1K/12K/High Sensitivity Assay Labchip (Hopkinton, MA, Cat. No. 760517). The efficiency of the capture was evaluated by performing a qPCR-based quality check on the built-in controls (qPCR SYBR Green assays, Applied Biosystems, Grand Island, NY). Four standardized oligo sets, RUNX2, PRKG1, SMG1, and NLK, were employed as internal quality controls. The enrichment of the capture libraries was estimated to range from 7- to 9-fold over background.

Library templates were prepared for sequencing using Illumina's cBot cluster generation system with TruSeq PE Cluster Generation Kits (Part no. PE-401-3001). Briefly, these libraries were denatured with sodium hydroxide and diluted to 6-9 pM in hybridization buffer in order to achieve a load density of ~800K clusters/mm2. Each library pool was loaded in a single lane of a HiSeq flow cell, and each lane was spiked with 2% phiX control library for run quality control. The sample libraries then underwent bridge amplification to form clonal clusters, followed by hybridization with the sequencing primer. Sequencing runs were performed in paired-end mode using the Illumina HiSeq 2000 platform. Using the TruSeq SBS Kits (Part no. FC-401-3001), sequencing-by-synthesis reactions were extended for 101 cycles from each end, with an additional 7 cycles for the index read. Sequencing runs generated approximately 300-400 million successful reads on each lane of a flow cell, with approximately 9-10 Gb produced per sample. With these sequencing yields, samples achieved an average of 95% of the targeted exome bases covered to a depth of 20X or greater.

Real Time Analysis (RTA) software was used to process the image analysis and nucleotide base calling. On average, about 80-100 million successful reads, consisting of 2X 100 bp, were generated on each lane of a flow cell.

Whole Exome Sequencing

*Protocols were performed at Baylor College of Medicine.

Mapping Reads

Illumina HiSeq bcl files were processed using BCLConvertor v1.7.1. All reads from the prepared libraries that passed the Illumina Chastity filter were formatted into fastq files. The fastq files were aligned to human reference genome build37 (NCBI) using BWA (bwa-0.5.9-R16) with default parameters with the following exceptions: seed sequence: 40 bpseed mismatch: 2, total mismatches allowed: 3. BAM files generated from alignment were preprocessed using GATK (v1.3-8-gb0e6afe) [1] to recalibrate and locally realign reads.

Mutation Detection

Sequence variants were called from tumor and matched normal BAM files using Atlas [2] an integrative variant analysis suite of tools specializing in the separation of true SNPs and insertions and deletions (indels) from sequencing and mapping errors in whole exome capture sequencing (WXS) data. The suite implements logistic regression models trained on validated WXS data to identify the true variants. ATLAS-SNP-2 (v1.3) [3] and ATLAS-Indel-2 (v0.3.1) along with Pindel (v0.2.4q) [4] were run on the BAM files producing variant data that were further filtered to remove all those observed fewer than 5 times or were present in less than 0.08 of the reads (e.g., variant allele fraction must be greater than 0.08 to undergo validation). At least one variant read of Q30 or better was required, and the variant had to lie in the central portion of the read (15% from the 5' end of the read and 20% from the 3' end). In addition, reads harboring the variant must have been observed in both forward and reverse orientations. Finally, the variant base was not observed in the normal tissue. Indels were discovered by similar processing except indels must have been observed in at least 10 of the reads.

Library construction

Specimen processing, DNA extraction and standard QC were performed, and Illumina paired-end pre-capture libraries were prepared, according to the manufacturer's protocol (Illumina Inc, San Diego, CA) with the following modifications: 0.5-1 µg genomic DNA in 100 µl volume was sheared into fragments of approximately 300 base pairs in a Covaris E210 system (Covaris, Inc., Woburn, MA). The settings were 10% duty cycle, intensity of 4, and 200 cycles per burst for 120 seconds. Fragment size was checked using a 2.2% Flash Gel DNA Cassette (Lonza, Walkersville, MD, Cat. No. 57023). End-repair of fragmented DNA was performed in 90 µl total reaction volume containing sheared DNA, 9 µl 10X buffer, 5 µl End Repair Enzyme Mix and H2O (NEBNext End-Repair Module, New England BioLabs, Ipswich, MA, Cat. No. E6050L), incubated at 20°C for 30 minutes. A-tailing was performed in a total reaction volume of 60 µl containing end-repaired DNA, 6 µl 10X buffer, 3 µl Klenow fragment (NEBNext dA-Tailing Module; Cat. No. E6053L) and H2O, followed by incubation at 37°C for 30 minutes. Illumina multiplex adapter ligation (NEBNext Quick Ligation Module, Cat. No. E6056L) was performed in a total reaction volume of 90 µl containing 18 µl 5X buffer, 5 µl ligase, 0.5 µl 100 µM adaptor and H2O at room temperature for 30 minutes. After ligation, PCR with Illumina PE 1.0 and modified barcode primers (manuscript in preparation) was performed in 170 µl reactions containing 85 µl of 2X Phusion High-Fidelity PCR master mix, adaptor-ligated DNA, 1.75 µl of 50 µM primers and H2O. PCR was performed using a 5-minute initial denaturation at 95°C, 6-10 cycles of 15 seconds at 95°C, 15 seconds at 60°C and 30 seconds at 72°C, followed by a final extension for 5 minutes at 72°C. Agencourt XP Beads (Beckman Coulter Genomics, Inc., Danvers, MA, Cat. No. A63882) were used to purify DNA after each enzymatic reaction.
After purification, PCR product quantity and size distribution were determined using the Caliper GX 1K/12K/High Sensitivity Assay Labchip (Hopkinton, MA, Cat. No. 760517).
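The stock volumes quoted for the ligation and PCR steps imply final working concentrations that are not stated in the protocol. The short sketch below works them out with the usual dilution relation (final = stock x added volume / total volume). The function name is hypothetical, and it is an assumption that the 1.75 µl of 50 µM primers refers to a single primer stock rather than to each primer separately.

```python
def final_concentration_um(stock_um, added_volume_ul, total_volume_ul):
    """Final concentration in µM after diluting a stock into a reaction."""
    return stock_um * added_volume_ul / total_volume_ul


# Adaptor in the ligation reaction: 0.5 µl of 100 µM stock in 90 µl total.
adaptor_um = final_concentration_um(100.0, 0.5, 90.0)

# Primers in the PCR reaction: 1.75 µl of 50 µM stock in 170 µl total.
primer_um = final_concentration_um(50.0, 1.75, 170.0)

if __name__ == "__main__":
    print(f"adaptor: {adaptor_um:.3f} µM")  # about 0.556 µM
    print(f"primers: {primer_um:.3f} µM")   # about 0.515 µM
```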

Exome capture

Illumina pre-capture libraries (1 µg DNA input) were hybridized in solution to SeqCap EZ Human Exome 2.0 (Nimblegen, Madison, WI) probes targeting approximately 44 Mb of sequence from approximately 30K genes, according to the manufacturer's protocol with the following modifications: hybridization-enhancing oligos IHE1, IHE2 and IHE3 replaced oligos HE1.1 and HE2.1, and post-capture LM-PCR was performed using 14 cycles. Capture libraries were quantified using the Caliper GX 1K/12K/High Sensitivity Assay Labchip (Hopkinton, MA, Cat. No. 760517). The efficiency of the capture was evaluated by performing a qPCR-based quality check on the built-in controls (qPCR SYBR Green assays, Applied Biosystems, Grand Island, NY). Four standardized oligo sets, RUNX2, PRKG1, SMG1, and NLK, were employed as internal quality controls. The enrichment of the capture libraries was estimated to range from 7- to 9-fold over background.

Library templates were prepared for sequencing using Illumina's cBot cluster generation system with TruSeq PE Cluster Generation Kits (Part no. PE-401-3001). Briefly, these libraries were denatured with sodium hydroxide and diluted to 6-9 pM in hybridization buffer to achieve a load density of ~800K clusters/mm². Each library pool was loaded in a single lane of a HiSeq flow cell, and each lane was spiked with 2% phiX control library for run quality control. The sample libraries then underwent bridge amplification to form clonal clusters, followed by hybridization with the sequencing primer. Sequencing runs were performed in paired-end mode on the Illumina HiSeq 2000 platform. Using TruSeq SBS Kits (Part no. FC-401-3001), sequencing-by-synthesis reactions were extended for 101 cycles from each end, with an additional 7 cycles for the index read. Sequencing runs generated approximately 300-400 million successful reads on each lane of a flow cell, with approximately 9-10 Gb produced per sample. With these sequencing yields, an average of 95% of the targeted exome bases were covered to a depth of 20X or greater.
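To put the per-sample yield in context, the sketch below divides the quoted 9-10 Gb by the approximately 44 Mb capture target. The result is only an upper bound on mean target depth, because it assumes every sequenced base is on target, uniquely mapped and non-duplicate, which the protocol does not claim. The function name is hypothetical.

```python
def mean_raw_depth(yield_bases, target_bases):
    """Upper-bound mean depth if every sequenced base fell on the target."""
    return yield_bases / target_bases


TARGET_BASES = 44e6  # approximate SeqCap EZ Human Exome 2.0 target size from the text

low_depth = mean_raw_depth(9e9, TARGET_BASES)    # 9 Gb per sample
high_depth = mean_raw_depth(10e9, TARGET_BASES)  # 10 Gb per sample

if __name__ == "__main__":
    print(f"raw depth bound: {low_depth:.1f}X to {high_depth:.1f}X")
```

A bound of roughly 200X is well above the 20X threshold reported for 95% of targeted bases, which leaves room for off-target reads, duplicates and uneven capture.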

Real Time Analysis (RTA) software was used for image analysis and nucleotide base calling. On average, about 80-100 million successful 2 x 100 bp reads were generated on each lane of a flow cell.

Whole Exome Sequencing

*Protocols were performed at Baylor College of Medicine.


Exome capture

Illumina pre-capture libraries (1ug DNA input) were hybridized in solution to SeqCap EZ Human Exome 2.0 (Nimblegen, Madison, WI) probes targeting approximately 44 Mb of sequence from approximately 30K genes according to the manufacturer's protocol with the following modifications: hybridization enhancing oligos IHE1, IHE2 and IHE3 replaced oligos HE1.1 and HE2.1, and post-capture LM-PCR was performed using 14 cycles. Capture libraries were quantified using the Caliper GX 1K/12K/High Sensitivity Assay Labchip (Hopkinton, MA, Cat. No. 760517). The efficiency of the capture was evaluated by performing a qPCR-based quality check on the built-in controls (qPCR SYBR Green assays, Applied Biosystems, Grand Island, NY). Four standardized oligo sets, RUNX2, PRKG1, SMG1, and NLK, were employed as internal quality controls. The enrichment of the capture libraries was estimated to range from 7- to 9-fold over background.

Library templates were prepared for sequencing using Illumina's cBot cluster generation system with TruSeq PE Cluster Generation Kits (Part no. PE-401-3001). Briefly, these libraries were denatured with sodium hydroxide and diluted to 6-9 pM in hybridization buffer to achieve a load density of ~800K clusters/mm². Each library pool was loaded in a single lane of a HiSeq flow cell, and each lane was spiked with 2% phiX control library for run quality control. The sample libraries then underwent bridge amplification to form clonal clusters, followed by hybridization with the sequencing primer. Sequencing runs were performed in paired-end mode using the Illumina HiSeq 2000 platform. Using the TruSeq SBS Kits (Part no. FC-401-3001), sequencing-by-synthesis reactions were extended for 101 cycles from each end, with an additional 7 cycles for the index read. Sequencing runs generated approximately 300-400 million successful reads on each lane of a flow cell, with approximately 9-10 Gb produced per sample. With these sequencing yields, samples achieved an average of 95% of the targeted exome bases covered to a depth of 20X or greater.
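As a rough check on the figures above, the per-sample yield can be related to the size of the capture target. The sketch below is illustrative only: the function names are hypothetical, the 44 Mb target size is taken from the exome capture description, and the raw mean depth is an upper bound because the text does not state the on-target or duplicate rates.

```python
from typing import Sequence


def raw_mean_depth(yield_bases: float, target_bases: float) -> float:
    """Upper-bound mean depth if every sequenced base fell on the target."""
    return yield_bases / target_bases


def fraction_at_depth(depths: Sequence[int], min_depth: int = 20) -> float:
    """Fraction of target positions covered at or above min_depth."""
    if not depths:
        return 0.0
    return sum(1 for d in depths if d >= min_depth) / len(depths)


# 9-10 Gb per sample over a target of about 44 Mb
low = raw_mean_depth(9e9, 44e6)    # about 205x
high = raw_mean_depth(10e9, 44e6)  # about 227x
```

The reported metric (95% of targeted bases at 20X or greater) corresponds to `fraction_at_depth` evaluated over the per-base depths of the target regions, which would come from the aligned BAM files.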

Real Time Analysis (RTA) software was used for image analysis and nucleotide base calling. On average, about 80-100 million successful reads, consisting of 2 x 100 bp, were generated on each lane of a flow cell.

Whole Exome Sequencing

*Protocols were performed at Baylor College of Medicine.

Mapping Reads

Illumina HiSeq bcl files were processed using BCLConvertor v1.7.1. All reads from the prepared libraries that passed the Illumina Chastity filter were formatted into fastq files. The fastq files were aligned to human reference genome build 37 (NCBI) using BWA (bwa-0.5.9-R16) with default parameters, with the following exceptions: seed sequence: 40 bp, seed mismatch: 2, total mismatches allowed: 3. BAM files generated from alignment were preprocessed using GATK (v1.3-8-gb0e6afe) [1] to recalibrate and locally realign reads.
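The non-default alignment settings (seed length 40 bp, 2 seed mismatches, 3 total mismatches) map onto the `-l`, `-k` and `-n` options of `bwa aln`. The sketch below only assembles the argument lists and does not run anything. The exact commands used at the sequencing center are not given in the text, so the helper names and all file names here are hypothetical.

```python
from typing import List


def bwa_aln_command(reference: str, fastq: str,
                    seed_length: int = 40,
                    seed_mismatches: int = 2,
                    max_mismatches: int = 3) -> List[str]:
    """Build a `bwa aln` call; the .sai output is written to stdout."""
    return [
        "bwa", "aln",
        "-l", str(seed_length),      # seed length in bp
        "-k", str(seed_mismatches),  # maximum differences in the seed
        "-n", str(max_mismatches),   # maximum differences in the whole read
        reference, fastq,
    ]


def bwa_sampe_command(reference: str, sai1: str, sai2: str,
                      fastq1: str, fastq2: str) -> List[str]:
    """Build the paired-end `bwa sampe` call that emits SAM on stdout."""
    return ["bwa", "sampe", reference, sai1, sai2, fastq1, fastq2]


# Hypothetical file names, for illustration only.
cmd = bwa_aln_command("build37.fa", "sample_R1.fastq")
```

Building the commands as lists keeps them suitable for `subprocess.run` without shell quoting, with stdout redirected to the `.sai` or SAM file by the caller.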


 





Whole Exome Sequencing

*Protocols were performed at Baylor College of Medicine.

Library construction

Specimen processing, DNA extraction, standard QC and Illumina paired-end pre-capture libraries were prepared according to the manufacturer's protocol (Illumina Inc, San Diego, CA) with the following modifications: 0.5 - 1ug genomic DNA in 100ul volume was sheared into fragments of approximately 300 base pairs in a Covaris E210 system (Covaris, Inc. Woburn, MA). The setting was 10% duty cycle, intensity of 4,200 cycles per burst for 120 seconds. Fragment size was checked using a 2.2% Flash Gel DNA Cassette (Lonza, Walkersville, MD, Cat. No.57023). End-repair of fragmented DNA was performed in 90ul total reaction volume containing sheared DNA, 9 ul 10X buffer, 5 ul END Repair Enzyme Mix and H2O (NEBNext End-Repair Module, New England BioLabs, Ipswich, MA, Cat. No. E6050L), incubated at 20°C for 30 minutes. A-tailing was performed in a total reaction volume of 60ul containing end-repaired DNA, 6ul 10X buffer, 3ul Klenow fragment (NEBNext dA-Tailing Module; Cat. No. E6053L) and H2O followed by incubation at 37°C for 30 minutes. Illumina multiplex adapter ligation (NEBNext Quick Ligation Module Cat. No. E6056L) was performed in a total reaction volume of 90ul containing 18ul 5X buffer, 5ul ligase, 0.5ul 100uM adaptor and H2O at room temperature for 30 minutes. After ligation, PCR with Illumina PE 1.0 and modified barcode primers (manuscript in preparation) was performed in 170ul reactions containing 85ul of 2x Phusion High-Fidelity PCR master mix, adaptor ligated DNA, 1.75ul of 50uM primers and H2O. PCR was performed using a 5-minute initial denaturation at 95°C, 6-10 cycles of 15 seconds at 95°C, 15 seconds at 60°C and 30 seconds at 72°C followed by a final extension for 5 minute at 72°C. Agencourt XP Beads (Beckman Coulter Genomics, Inc., Danvers, MA, Cat. No. A63882) were used to purify DNA after each enzymatic reaction. 
After purification, PCR product quantification and size distribution was determined using the Caliper GX 1K/12K/High Sensitivity Assay Labchip (Hopkinton, MA, Cat. No. 760517).

Exome capture

Illumina pre-capture libraries (1ug DNA input) were hybridized in solution to SeqCap EZ Human Exome 2.0 (Nimblegen, Madison, WI) probes targeting approximately 44Mbs of sequence from approximately 30K genes according to the manufacturer's protocol with the following modifications: hybridization enhancing oligos IHE1, IHE2 and IHE3 replaced oligos HE1.1 and HE2.1 and post-capture LM-PCR was performed using 14 cycles. Capture libraries were quantified using Caliper GX 1K/12K/High Sensitivity Assay Labchip (Hopkinton, MA, Cat. No. 760517). The efficiency of the capture was evaluated by performing a qPCR-based quality check on the built-in controls (qPCR SYBR Green assays, Applied Biosystems, Grand Island, NY). Four standardized oligo sets, RUNX2, PRKG1, SMG1, and NLK, were employed as internal quality controls. The enrichment of the capture libraries was estimated to range from 7- to 9-fold over background.

Library templates were prepared for sequencing using Illumina's cBot cluster generation system with TruSeq PE Cluster Generation Kits (Part no. PE-401-3001). Briefly, these libraries were denatured with sodium hydroxide and diluted to 6-9 pM in hybridization buffer to achieve a load density of ~800K clusters/mm². Each library pool was loaded in a single lane of a HiSeq flow cell, and each lane was spiked with 2% phiX control library for run quality control. The sample libraries then underwent bridge amplification to form clonal clusters, followed by hybridization with the sequencing primer. Sequencing runs were performed in paired-end mode using the Illumina HiSeq 2000 platform. Using the TruSeq SBS Kits (Part no. FC-401-3001), sequencing-by-synthesis reactions were extended for 101 cycles from each end, with an additional 7 cycles for the index read. Sequencing runs generated approximately 300-400 million successful reads on each lane of a flow cell, with approximately 9-10 Gb produced per sample. With these sequencing yields, samples achieved an average of 95% of the targeted exome bases covered to a depth of 20X or greater.
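To make the yield and coverage figures concrete, here is a small sketch of the two quantities involved: the mean depth a given yield could support, and the fraction of target bases covered at 20X or more. The function names and the toy depth list are assumptions for illustration. The source does not report an on-target rate, so the default of 1.0 gives an upper bound rather than an expected value.

```python
def mean_target_depth(yield_bases, target_bases, on_target_fraction=1.0):
    """Mean depth over the target if on_target_fraction of the yield lands on it."""
    return yield_bases * on_target_fraction / target_bases


def breadth_at_depth(depths, min_depth=20):
    """Fraction of target positions whose depth is at least min_depth."""
    if not depths:
        raise ValueError("depths must contain at least one position")
    covered = sum(1 for depth in depths if depth >= min_depth)
    return covered / len(depths)


# Upper bounds for 9-10 Gb per sample over a ~44 Mb target.
low = mean_target_depth(9e9, 44e6)
high = mean_target_depth(10e9, 44e6)
print(f"upper-bound mean depth: {low:.0f}X to {high:.0f}X")

# Toy per-base depths (made up), one value per target position.
toy_depths = [0, 5, 19, 20, 21, 150, 300, 45, 60, 80]
print(f"breadth at 20X: {breadth_at_depth(toy_depths):.0%}")
```

In practice the per-base depths would come from a depth-of-coverage tool run over the capture target intervals; the list above only illustrates the calculation.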

Real Time Analysis (RTA) software was used for image analysis and nucleotide base calling. On average, about 80-100 million successful reads, consisting of 2 x 100 bp, were generated on each lane of a flow cell.

Mapping Reads

Illumina HiSeq bcl files were processed using BCLConvertor v1.7.1. All reads from the prepared libraries that passed the Illumina Chastity filter were formatted into fastq files. The fastq files were aligned to human reference genome build 37 (NCBI) using BWA (bwa-0.5.9-R16) with default parameters, with the following exceptions: seed sequence: 40 bp; seed mismatch: 2; total mismatches allowed: 3. BAM files generated from alignment were preprocessed using GATK (v1.3-8-gb0e6afe) [1] to recalibrate and locally realign reads.
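The non-default alignment settings can be sketched as a command line. The source does not give the actual invocation, so the mapping below is an assumption: it reads the three settings as the `bwa aln` options `-l` (seed length), `-k` (maximum differences in the seed) and `-n` (maximum differences overall). The file names are hypothetical placeholders.

```python
def bwa_aln_command(reference, fastq, seed_length=40, seed_mismatches=2,
                    total_mismatches=3):
    """Build a `bwa aln` argument list using the settings described in the text."""
    return [
        "bwa", "aln",
        "-l", str(seed_length),       # seed sequence length in bp
        "-k", str(seed_mismatches),   # differences allowed within the seed
        "-n", str(total_mismatches),  # differences allowed over the whole read
        reference,
        fastq,
    ]


# Hypothetical file names, for illustration only.
cmd = bwa_aln_command("build37.fa", "sample_R1.fastq")
print(" ".join(cmd))
```

With this generation of BWA, a paired-end sample would typically need one such `aln` run per read file followed by `bwa sampe` to pair the alignments; that step is not described in the source and is omitted here.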

Mutation Detection

Sequence variants were called from tumor and matched normal BAM files using Atlas [2], an integrative variant analysis suite that specializes in separating true SNPs and insertions and deletions (indels) from sequencing and mapping errors in whole exome capture sequencing (WXS) data. The suite implements logistic regression models trained on validated WXS data to identify true variants. ATLAS-SNP-2 (v1.3) [3] and ATLAS-Indel-2 (v0.3.1), along with Pindel (v0.2.4q) [4], were run on the BAM files, producing variant calls that were further filtered to remove any variant observed fewer than 5 times or present in less than 0.08 of the reads (i.e., the variant allele fraction had to be greater than 0.08 for a variant to undergo validation). At least one variant read of Q30 or better was required, and the variant had to lie in the central portion of the read (excluding the first 15% of the read from the 5' end and the last 20% from the 3' end). In addition, reads harboring the variant had to be observed in both forward and reverse orientations. Finally, the variant base must not have been observed in the normal tissue. Indels were discovered by similar processing, except that indels had to be observed in at least 10 of the reads.
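The SNV filtering criteria above can be expressed as a short predicate. This is a sketch under stated assumptions, not the Atlas implementation: the `Candidate` fields and function names are hypothetical, the allele-fraction boundary is taken as "at least 0.08", and the central-portion rule is interpreted as requiring at least one supporting read with the variant between 15% and 80% of the read length. The different read-count requirement for indels is not modeled.

```python
from dataclasses import dataclass


def in_central_portion(offset, read_length):
    """True if a 0-based offset from the 5' end avoids the first 15% and last 20%."""
    return 15 * read_length <= 100 * offset < 80 * read_length


@dataclass
class Candidate:
    """Hypothetical per-variant summary of read support."""
    variant_reads: int          # tumor reads carrying the variant
    total_reads: int            # all tumor reads at the site
    best_variant_quality: int   # highest base quality among variant reads
    forward_reads: int          # variant reads on the forward strand
    reverse_reads: int          # variant reads on the reverse strand
    central_reads: int          # variant reads passing in_central_portion
    normal_variant_reads: int   # matched-normal reads carrying the variant


def passes_snv_filter(c, min_reads=5, min_fraction=0.08, min_quality=30):
    """Apply the thresholds described in the text to one candidate SNV."""
    if c.total_reads <= 0 or c.variant_reads < min_reads:
        return False
    if c.variant_reads / c.total_reads < min_fraction:  # boundary is an assumption
        return False
    if c.best_variant_quality < min_quality:
        return False
    if c.forward_reads < 1 or c.reverse_reads < 1:
        return False
    if c.central_reads < 1:  # "at least one" is an assumption
        return False
    return c.normal_variant_reads == 0


good = Candidate(12, 60, 37, 7, 5, 9, 0)
one_strand = Candidate(12, 60, 37, 12, 0, 9, 0)
print(passes_snv_filter(good), passes_snv_filter(one_strand))
```

Keeping the thresholds as keyword arguments makes it easy to see how sensitive a call set is to each cutoff.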

 

Whole Exome Sequencing

*Protocols were performed at Baylor College of Medicine.

Library construction

Specimen processing, DNA extraction, standard QC and Illumina paired-end pre-capture libraries were prepared according to the manufacturer's protocol (Illumina Inc, San Diego, CA) with the following modifications: 0.5 - 1ug genomic DNA in 100ul volume was sheared into fragments of approximately 300 base pairs in a Covaris E210 system (Covaris, Inc. Woburn, MA). The setting was 10% duty cycle, intensity of 4,200 cycles per burst for 120 seconds. Fragment size was checked using a 2.2% Flash Gel DNA Cassette (Lonza, Walkersville, MD, Cat. No.57023). End-repair of fragmented DNA was performed in 90ul total reaction volume containing sheared DNA, 9 ul 10X buffer, 5 ul END Repair Enzyme Mix and H2O (NEBNext End-Repair Module, New England BioLabs, Ipswich, MA, Cat. No. E6050L), incubated at 20°C for 30 minutes. A-tailing was performed in a total reaction volume of 60ul containing end-repaired DNA, 6ul 10X buffer, 3ul Klenow fragment (NEBNext dA-Tailing Module; Cat. No. E6053L) and H2O followed by incubation at 37°C for 30 minutes. Illumina multiplex adapter ligation (NEBNext Quick Ligation Module Cat. No. E6056L) was performed in a total reaction volume of 90ul containing 18ul 5X buffer, 5ul ligase, 0.5ul 100uM adaptor and H2O at room temperature for 30 minutes. After ligation, PCR with Illumina PE 1.0 and modified barcode primers (manuscript in preparation) was performed in 170ul reactions containing 85ul of 2x Phusion High-Fidelity PCR master mix, adaptor ligated DNA, 1.75ul of 50uM primers and H2O. PCR was performed using a 5-minute initial denaturation at 95°C, 6-10 cycles of 15 seconds at 95°C, 15 seconds at 60°C and 30 seconds at 72°C followed by a final extension for 5 minute at 72°C. Agencourt XP Beads (Beckman Coulter Genomics, Inc., Danvers, MA, Cat. No. A63882) were used to purify DNA after each enzymatic reaction. 
After purification, PCR product quantification and size distribution was determined using the Caliper GX 1K/12K/High Sensitivity Assay Labchip (Hopkinton, MA, Cat. No. 760517).

Exome capture

Illumina pre-capture libraries (1ug DNA input) were hybridized in solution to SeqCap EZ Human Exome 2.0 (Nimblegen, Madison, WI) probes targeting approximately 44Mbs of sequence from approximately 30K genes according to the manufacturer's protocol with the following modifications: hybridization enhancing oligos IHE1, IHE2 and IHE3 replaced oligos HE1.1 and HE2.1 and post-capture LM-PCR was performed using 14 cycles. Capture libraries were quantified using Caliper GX 1K/12K/High Sensitivity Assay Labchip (Hopkinton, MA, Cat. No. 760517). The efficiency of the capture was evaluated by performing a qPCR-based quality check on the built-in controls (qPCR SYBR Green assays, Applied Biosystems, Grand Island, NY). Four standardized oligo sets, RUNX2, PRKG1, SMG1, and NLK, were employed as internal quality controls. The enrichment of the capture libraries was estimated to range from 7- to 9-fold over background.

Library templates were prepared for sequencing using Illumina's cBot cluster generation system with TruSeq PE Cluster Generation Kits (Part no. PE-401-3001). Briefly, these libraries were denatured with sodium hydroxide and diluted to 6-9 pM in hybridization buffer in order to achieve a load density of ~800K clusters/mm2. Each library pool was loaded in a single lane of a HiSeq flow cell, and each lane was spiked with 2% phiX control library for run quality control. The sample libraries then underwent bridge amplification to form clonal clusters, followed by hybridization with the sequencing primer. Sequencing runs were performed in paired-end mode using the Illumina HiSeq 2000 platform. Using the TruSeq SBS Kits (Part no. FC-401-3001), sequencing-by-synthesis reactions were extended for 101 cycles from each end, with an additional 7 cycles for the index read. Sequencing runs generated approximately 300-400 million successful reads on each lane of a flow cell, with approximately 9-10 Gb produced per sample. With these sequencing yields, samples achieved an average of 95% of the targeted exome bases covered to a depth of 20X or greater.

Real Time Analysis (RTA) software was used for image analysis and nucleotide base calling. On average, about 80-100 million successful reads, consisting of 2 x 100 bp, were generated on each lane of a flow cell.

Whole Exome Sequencing

*Protocols were performed at Baylor College of Medicine.

Mapping Reads

Illumina HiSeq bcl files were processed using BCLConvertor v1.7.1. All reads from the prepared libraries that passed the Illumina Chastity filter were formatted into fastq files. The fastq files were aligned to human reference genome build 37 (NCBI) using BWA (bwa-0.5.9-R16) with default parameters, with the following exceptions: seed sequence: 40 bp; seed mismatch: 2; total mismatches allowed: 3. BAM files generated from alignment were preprocessed using GATK (v1.3-8-gb0e6afe) [1] to recalibrate and locally realign reads.
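The non-default alignment settings can be sketched as a `bwa aln` invocation. The mapping of the three stated settings onto the `-l` (seed length), `-k` (maximum differences in the seed) and `-n` (maximum differences overall) options is an assumption based on the BWA manual, not something the protocol states; the file names are hypothetical, and the paired-end `bwa sampe` step is omitted.

```python
import shlex


def bwa_aln_command(reference: str, fastq: str, seed_len: int = 40,
                    seed_mismatches: int = 2, max_mismatches: int = 3) -> list:
    """Build the argument list for one `bwa aln` run (one per read file)."""
    return [
        "bwa", "aln",
        "-l", str(seed_len),         # seed sequence length (assumed mapping)
        "-k", str(seed_mismatches),  # differences allowed in the seed
        "-n", str(max_mismatches),   # total differences allowed
        reference, fastq,
    ]


# Hypothetical file names, for illustration only.
cmd = bwa_aln_command("build37.fa", "sample_R1.fastq")
print(shlex.join(cmd))
```

Building the command as a list rather than a single string keeps it safe to pass to `subprocess.run` without shell quoting issues.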

Mutation Detection

Sequence variants were called from tumor and matched normal BAM files using Atlas [2], an integrative variant analysis suite of tools specializing in the separation of true SNPs and insertions and deletions (indels) from sequencing and mapping errors in whole exome capture sequencing (WXS) data. The suite implements logistic regression models trained on validated WXS data to identify the true variants. ATLAS-SNP-2 (v1.3) [3] and ATLAS-Indel-2 (v0.3.1), along with Pindel (v0.2.4q) [4], were run on the BAM files, producing variant data that were further filtered to remove all variants that were observed fewer than 5 times or that were present in less than 0.08 of the reads (i.e., the variant allele fraction must be greater than 0.08 to undergo validation). At least one variant read of Q30 or better was required, and the variant had to lie in the central portion of the read (15% from the 5' end of the read and 20% from the 3' end). In addition, reads harboring the variant must have been observed in both forward and reverse orientations. Finally, the variant base must not have been observed in the normal tissue. Indels were discovered by similar processing except indels must have been observed in at least 10 of the reads.
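The post-calling filter for base substitutions can be sketched as a single predicate. This is an illustration of the stated thresholds, not the Atlas implementation: the `VariantRead` record and function names are hypothetical, and the central-portion rule is interpreted here as requiring at least one supporting read with the variant between 15% and 80% of the read length, which is an assumption.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class VariantRead:
    base_quality: int  # Phred-scaled quality of the variant base
    offset: int        # 0-based position of the variant from the 5' end
    read_length: int
    is_reverse: bool   # True if the read aligned to the reverse strand


def in_central_portion(read: VariantRead) -> bool:
    """Variant lies past the first 15% and before the last 20% of the read."""
    frac = read.offset / read.read_length
    return 0.15 <= frac <= 0.80


def passes_snv_filter(variant_reads: Sequence[VariantRead], total_depth: int,
                      normal_variant_reads: int) -> bool:
    n = len(variant_reads)
    if total_depth <= 0 or n < 5:            # observed at least 5 times
        return False
    if n / total_depth < 0.08:               # variant allele fraction
        return False
    if not any(r.base_quality >= 30 for r in variant_reads):
        return False                         # at least one Q30 variant read
    if not any(in_central_portion(r) for r in variant_reads):
        return False                         # assumed reading of the rule
    if {r.is_reverse for r in variant_reads} != {True, False}:
        return False                         # both orientations required
    return normal_variant_reads == 0         # absent from matched normal
```

For example, six Q35 variant reads split across both strands at a tumor depth of 50 pass the filter, while the same reads at a depth of 100 (allele fraction 0.06) do not.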

 

Whole Exome Sequencing

*Protocols were performed at the Broad Institute. Please reference Pugh et al. (Published in final edited form as: Nat Genet. 2013 Mar; 45(3): 279–284).

The generation, sequencing, and analysis of 222 pairs of exome libraries at the Broad Institute was performed using a previously described protocol. Due to the small quantities of DNA available, 81 DNA samples were amplified using Phi29-based multiple-strand displacement whole genome amplification (Repli-g service, QIAgen). Exonic regions were captured by in-solution hybridization using RNA baits similar to those described but supplemented with probes capturing additional genes listed in RefSeq in addition to the original Consensus Coding Sequence (CCDS) set. In total, ~33 Mb of genomic sequence was targeted, consisting of 193,094 exons from 18,863 genes annotated by the CCDS and RefSeq databases as coding for protein or micro-RNA (accessed November 2010). Sequencing of 76 bp paired-end reads was performed using Illumina Genome Analyzer IIx and HiSeq 2000 instruments. Reads were aligned to the hg19/GRCh37 build of the reference human genome sequence using BWA. PCR duplicates were flagged in the BAM files for exclusion from further analysis using the Picard MarkDuplicates tool. To confirm sample identity, copy number profiles derived from sequence data were compared with those derived from microarray data when available. Candidate somatic base substitutions were detected using muTect (previously referred to as muTector) and insertions and deletions were detected using IndelGenotyper. Segmental copy number ratios were calculated as the ratio of tumor fractional read-depth to the average fractional read-depth observed in normal samples for that region.
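The segmental copy number ratio described above can be sketched as follows. The sketch assumes that "fractional read-depth" means the reads falling in a segment divided by the total reads for that sample; that definition and the function names are assumptions, not taken from the published pipeline.

```python
from typing import Sequence, Tuple


def fractional_depth(segment_reads: int, total_reads: int) -> float:
    """Share of a sample's reads that fall within one segment."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return segment_reads / total_reads


def copy_ratio(tumor_segment_reads: int, tumor_total_reads: int,
               normals: Sequence[Tuple[int, int]]) -> float:
    """Tumor fractional depth over the mean fractional depth in normals.

    `normals` holds (segment_reads, total_reads) pairs, one per normal sample.
    """
    if not normals:
        raise ValueError("at least one normal sample is required")
    tumor_frac = fractional_depth(tumor_segment_reads, tumor_total_reads)
    normal_fracs = [fractional_depth(s, t) for s, t in normals]
    mean_normal = sum(normal_fracs) / len(normal_fracs)
    if mean_normal == 0:
        raise ValueError("segment has no coverage in the normal samples")
    return tumor_frac / mean_normal
```

Normalizing by total reads per sample makes the ratio insensitive to differences in sequencing yield, so a ratio near 1 indicates no change relative to the normal panel and a ratio near 2 indicates a doubling of relative depth.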

Removal of oxoG library preparation artifact

Cases sequenced using WGA and native DNA were sequenced more than eight months apart by the Sequencing Platform at the Broad Institute. Initial comparison of candidate mutation calls from these two data sets identified a preponderance of apparent G>T or C>A substitutions of low allele fraction (<0.15) and within specific sequence contexts (Supplementary Figure 2A). We subsequently characterized this artifact and developed a method to detect and remove these events. In brief, these artifacts are introduced at the DNA shearing step of the library construction process and arise from the oxidation of guanine bases (oxoG) by high-energy sonication. During downstream PCR, oxoG bases preferentially pair with thymine rather than cytosine, resulting in apparent G>T or C>A substitutions of low allele fraction and enriched within specific sequence contexts (Supplementary Figure 2B). Consistent with this mechanism, the intensity of the sonication process was increased with the introduction of a new 150 bp shearing protocol between preparation of the WGA and native DNA samples.

The number of artifacts in a library was apparently sample-dependent (Supplementary Figure 2C) and these events were found in unmatched tumor and normal libraries. In some cases, thousands of candidate mutations were called in cases with a heavily affected tumor sample and an unaffected normal. However, nearly every sample had at least one such artifact and we have observed similar events in publicly available data sets from other centers, suggesting a common artifact mode that was exacerbated in some of our samples. To address this problem, we devised a method to differentiate oxoG artifacts from bona fide mutations.

Due to the modification of only one strand of a G:C base-pair (i.e. only the G base), reads supporting the artifact have characteristic read-orientation conferred upon adapter ligation. Therefore, all reads supporting an artifact were almost exclusively derived from the first or second read of the Illumina HiSeq instrument. Bona fide variants are supported by near-equal numbers of first and second reads. We made use of the skewed read-orientation combinations and low allele fractions characteristic of this artifact to identify and remove oxoG artifacts from mutation calls in our cohort (i.e. removal of all variants with allele fraction <0.1 or exclusively supported by a single read orientation). 
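The removal rule stated in the parenthetical can be sketched as a simple predicate. This follows only the two criteria given in the text (allele fraction below 0.1, or support from a single read orientation) and is not the published filter; the function and parameter names are hypothetical, and "read orientation" is taken to mean first versus second read of the pair.

```python
def keep_after_oxog_filter(alt_first_read: int, alt_second_read: int,
                           depth: int, min_allele_fraction: float = 0.1) -> bool:
    """Return True if a candidate variant survives the oxoG filter."""
    alt_total = alt_first_read + alt_second_read
    if depth <= 0 or alt_total == 0:
        return False
    if alt_total / depth < min_allele_fraction:
        return False  # low allele fraction, typical of the artifact
    # Bona fide variants are supported by both first and second reads.
    return alt_first_read > 0 and alt_second_read > 0
```

A candidate with 6 first-read and 5 second-read supporting reads at depth 50 is kept, whereas one with 11 supporting reads all from the first read is removed despite having the same allele fraction.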

Whole Exome Sequencing

*Protocols were performed at the Broad Institute. Please reference Pugh et al. (Published in final edited form as: Nat Genet. 2013 Mar; 45(3): 279–284).

Verification of somatic mutations and rearrangements

We used a combination of genotyping and sequencing technologies to verify random candidate mutations (PCR/Sanger and PCR/HiSeq sequencing of candidates from Complete Genomics and BC Cancer Agency Illumina WGS and RNA-seq data), as well as mutations supportive of our significance analyses (Sequenom and PCR/MiSeq of WES and WGS data). Combining all of the validation experiments resulted in overall validation rates of 87% for substitutions (525/605 candidates, 241/282 coding) and 34% for indels (27/79 candidates, 26/41 coding). Some mutations were verified using multiple technologies and therefore the total number of candidate mutations verified is lower than the sum total of mutations described in the Supplementary Note. See Supplementary Note for details and cross-platform comparisons.

Integrated analysis of somatic variation from exome and genome data sets

Somatic mutations detected in WGS, WES, and RNA-seq data sets were annotated using Oncotator (See Broad Institute Cancer Genome Analysis webpage). Genes mutated at a statistically significant frequency were identified using MutSig, a method that identifies genes with mutation frequencies greater than expected by chance, given detected background mutation rates, gene length and callable sequence in each tumor/normal pair. The relationship between mutation frequency and age of diagnosis was tested using the Spearman rank test. The implementation of the Kolmogorov-Smirnov test in R version 2.11.1 (ks.test) was used to test differences in mutation frequency distributions of several clinical variables (Supplementary Table 4).
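As an illustration of the Spearman rank test mentioned above, the following Python sketch computes only the rank correlation coefficient (no p-value) as the Pearson correlation of average ranks. It is not the code used in the study; the function names are hypothetical.

```python
def average_ranks(values):
    """Return 1-based ranks, assigning tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation computed on ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (vx * vy) ** 0.5
```

Here `x` could hold per-patient mutation frequencies and `y` the corresponding ages at diagnosis; a value near +1 or -1 indicates a strong monotonic relationship.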

Germline variant analysis

Detection of pathogenic germline variation at base-pair resolution in a cohort of cancer patients is complicated by selection of an appropriately matched and sized control population, relatively high carrier frequencies for unrelated disorders, and complex genetics underlying cancer predisposition. To nominate germline variants predisposing to neuroblastoma, we searched for enrichment of putative functional variants in the blood-derived DNA samples from our WES cohort compared to normal DNAs from 1,974 European American individuals sequenced by the National Heart, Lung, and Blood Institute Grand Opportunity Exome Sequencing Project (ESP). As indel calls from the ESP cohort were not publicly available at the time of our study, we did not include them in our analysis.

To ensure consistency and accuracy of germline variant detection, all neuroblastoma WES cases were called simultaneously with 800 WES cases from the 1000 Genomes project using the UnifiedGenotyper from the Genome Analysis Toolkit. A principal component analysis of the genotype calls was performed to determine the ethnic background of our cases (Supplementary Figure 7) with respect to three 1000 Genomes populations. As over 80% of our cohort was Caucasian or admixed Caucasian, we downloaded genotyping calls and coverage information from 1,974 European American individuals available on the ESP website to serve as a control population. To focus our analysis on rare variation consistent with the low prevalence of neuroblastoma, we removed from both data sets all variants present in individuals sequenced as part of the 1000 Genomes project. Next, we generated two lists of rare variants: overlaps with clinically reported variants recorded in ClinVar (downloaded 4/27/2012; 284 variants in neuroblastoma, 2,947 in ESP) and loss-of-function variants in any of 924 genes listed in the Cancer Gene Census, Familial Cancer database, or a list of DNA repair genes (86 neuroblastoma, 1,068 ESP). We then tested each gene for significant enrichment of variants in the neuroblastoma cohort compared to the ESP cohort (1-tailed Fisher’s exact test, Supplementary Tables 7 and 8).
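The per-gene enrichment test can be sketched as a one-tailed Fisher's exact test on a 2x2 table, computed from the hypergeometric distribution. This is an illustrative implementation, not the code used in the study. How the table cells were counted (for example, carriers versus non-carriers in each cohort) is an assumption here, and the function and parameter names are hypothetical.

```python
from math import comb


def fisher_exact_greater(case_with, case_without, ctrl_with, ctrl_without):
    """One-tailed Fisher's exact test for enrichment in cases.

    Returns P(X >= case_with), where X is hypergeometric: the number of
    variant carriers falling in the case group when the row and column
    totals of the 2x2 table are held fixed.
    """
    n_cases = case_with + case_without
    n_with = case_with + ctrl_with
    n_without = case_without + ctrl_without
    total = n_cases + ctrl_with + ctrl_without

    denom = comb(total, n_cases)
    numer = 0
    # Sum the probabilities of all tables at least as extreme as observed.
    for k in range(case_with, min(n_with, n_cases) + 1):
        numer += comb(n_with, k) * comb(n_without, n_cases - k)
    return numer / denom
```

For a gene with 3 carriers among 4 cases and 1 carrier among 4 controls, `fisher_exact_greater(3, 1, 1, 3)` gives 17/70, or about 0.243. Exact integer arithmetic is used for the binomial coefficients so that cohort sizes in the thousands do not overflow.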

The germline ClinVar analysis uncovered four genes of significance driven by single variants seen at greater frequency in neuroblastoma compared to ESP: CYP2D6, NOD2, SLC34A3, and HPD. All of these variants are present at low frequency in an expanded European American ESP cohort (rs5030865 in 1/8,524 chromosomes, rs104895438 in 5/8,600, rs121918239 in 14/8,514, and rs137852868 in 11/8,600), suggesting they are benign polymorphisms. Note that, while candidates detected by this approach are not significant after correction for multiple testing, we believe there is sufficient biological rationale and supporting evidence for validation in larger cohorts. We also looked for overlap with sites recorded in COSMIC. This analysis identified a TP53 variant associated with Li-Fraumeni syndrome.

Whole Exome Sequencing

*Protocols were performed at St. Jude Children’s Research Hospital.

Library construction utilized DNA tagmentation (fragmentation and adapter attachment) with the reagent provided in the Illumina Nextera rapid exome kit (version 1.2) and was performed on the Caliper Biosciences (PerkinElmer) Sciclone G3. First-round PCR (10 cycles) was performed using Illumina Nextera kit v1.2 reagents, and clean-up steps employed BC/Agencourt AMPure XP beads. Target capture utilized the Illumina Nextera rapid capture exome kit v1.2 and the supplied hybridization and associated reagents. The pre-hybridization pool size was 12 samples, and second-round PCR (10 cycles) was performed with Nextera kit v1.2 reagents. Library quality control was performed using a Victor fluorescence plate reader with Quant-it dsDNA reagents for pre-pool quantitation, and an Agilent Bio-analyzer 2200 for final library quantitation. Paired-end sequencing was performed on an Illumina HiSeq 2500 with a read length of 100 bp.

Whole Exome Sequencing

*Protocols were performed at St. Jude Children’s Research Hospital.

Paired-end WXS data were aligned to the human reference genome GRCh37 using BWA [1] (version 0.7.12). SAMtools [2] (version 1.3.1) was used to generate coordinate-sorted and indexed BAM files, and the Picard (version 1.129) MarkDuplicates module was then used to mark PCR duplicates.

SNV/indel calling and filter workflow. The GATK UnifiedGenotyper module was used to identify SNVs and indels in leukemia and germline samples. Calls were filtered by an in-house pipeline that excluded: 1) common SNPs/indels reported in UCSC dbSNP v142; 2) germline variants detected in matched germline control samples. All non-silent SNVs/indels that passed the filtering pipeline were manually reviewed, and only the highly reliable somatic ones were reported. Adjacent nucleotide changes on the same allele were merged into a single mutation.
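The two automated steps described above (exclusion of common and germline variants, then merging of adjacent changes) might be sketched in Python as follows. The in-house pipeline itself is not described in the source, so the `Variant` record and both function names are hypothetical. The sketch also assumes the caller has already established that the variants passed to `merge_adjacent` lie on the same allele, since phasing is not modeled here.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    """Hypothetical minimal variant record."""
    chrom: str
    pos: int   # 1-based position of the first reference base
    ref: str
    alt: str


def filter_somatic(tumor_calls, common_variants, germline_calls):
    """Drop tumor calls that match a common SNP/indel or a matched germline call."""
    excluded = set(common_variants) | set(germline_calls)
    return [v for v in tumor_calls if v not in excluded]


def merge_adjacent(variants):
    """Merge variants whose reference spans are directly contiguous.

    Assumes all input variants are on the same allele.
    """
    merged = []
    for v in sorted(variants, key=lambda v: (v.chrom, v.pos)):
        if (merged and merged[-1].chrom == v.chrom
                and merged[-1].pos + len(merged[-1].ref) == v.pos):
            last = merged.pop()
            merged.append(Variant(last.chrom, last.pos,
                                  last.ref + v.ref, last.alt + v.alt))
        else:
            merged.append(v)
    return merged
```

With this representation, G>T at position 100 and C>A at position 101 on the same chromosome are reported as a single GC>TA change at position 100. The manual review step in the text has no counterpart in this sketch.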

For patients with flow-sorted subpopulations of leukemia cells sequenced, mutation calling was performed de novo for each population. Mutations detected in one or more samples were then checked in the other samples from the same patient.

References

  1. Li H, Durbin R. (2009). Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 25(14):1754-60. (PMID: 19451168)
  2. Li H, et al. (2009). The Sequence Alignment/Map format and SAMtools. Bioinformatics. 25(16):2078-9. (PMID: 19505943)
Last updated: May 30, 2019