The genetic diagnosis of non-syndromic neurological disease presents unique challenges to clinicians and diagnostic labs: recognizable phenotypes are genetically heterogeneous, with many dozens of genes causally implicated; mutations in each gene explain only a negligible percentage of patient cases; and a complex, rapidly developing literature requires constant re-evaluation of the diagnostic strategy. We are pursuing a highly multiplexed approach to the diagnosis of non-syndromic intellectual disability, developmental delay and brain malformations, and are evaluating three enrichment approaches: multiplexed target amplification using the Raindance system, targeted capture using Agilent SureSelect biotinylated RNA baits, and Agilent exome capture. Our region of interest includes 1627 exons from 98 genes, spanning 500,000 bases of coding sequence. For each of 12 samples enriched using the Raindance system, we generated ~16 million 76 bp paired-end reads on an Illumina GAIIx. On average, 92% of reads aligned to the human genome, with 25-45% deriving from the target region. At an average depth of 150x, more than 94% of the target region was covered at 30x or greater, leaving 260 of the 1627 exons with small regions of insufficient coverage. On average, we identified 250 variants per sample within the target region, including 25 missense variants (two to five of which were novel) and 0 or 1 small coding deletions per patient. We identified a causative single-base deletion in the AGTR2 gene in a patient with intellectual disability and early infantile seizures, in whom extensive single-gene testing had previously failed to yield a diagnosis. We have also identified a number of potentially deleterious mutations in the remaining patients, and validation is currently underway.
A subset of these same samples has been enriched for our genes of interest using a custom Agilent SureSelect design and using Agilent exome capture, and a comparative data analysis will be presented. We propose that reliable integration of next-generation sequencing into the clinical arena requires generating sufficient data to achieve 100% coverage of the target region, defined as a minimum coverage of 30x at each base, in order to minimize false-positive and false-negative rates, and that any part of the target region failing to meet this requirement be supplemented by Sanger sequencing.
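
The proposed coverage criterion can be illustrated with a minimal sketch. It assumes per-base read depths are already available for each exon; the function name, exon labels and depth values below are hypothetical and are not taken from the study's pipeline or data.

```python
from typing import Dict, List, Tuple

MIN_DEPTH = 30  # per-base minimum coverage proposed in the abstract


def coverage_summary(
    exon_depths: Dict[str, List[int]], min_depth: int = MIN_DEPTH
) -> Tuple[float, List[str]]:
    """Return the fraction of target bases at or above min_depth and the
    exons containing at least one base below it (Sanger candidates)."""
    total_bases = 0
    covered_bases = 0
    needs_sanger: List[str] = []
    for exon, depths in exon_depths.items():
        total_bases += len(depths)
        ok = sum(1 for d in depths if d >= min_depth)
        covered_bases += ok
        if ok < len(depths):
            # At least one base falls short, so the exon is flagged.
            needs_sanger.append(exon)
    fraction = covered_bases / total_bases if total_bases else 0.0
    return fraction, needs_sanger


# Hypothetical per-base depths for three illustrative exons.
example = {
    "GENE1_exon1": [150, 142, 160, 155],
    "GENE1_exon2": [45, 31, 29, 12],
    "GENE2_exon1": [30, 30, 88, 97],
}
fraction, needs_sanger = coverage_summary(example)
print(f"{fraction:.1%} of target bases at >= {MIN_DEPTH}x")
print("Exons needing Sanger follow-up:", needs_sanger)
```

Flagging a whole exon when any single base falls below the threshold mirrors the way the abstract counts exons with "small regions" of insufficient coverage; a real workflow might instead report only the under-covered intervals.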
By: Scott Topper, Viswateja Nelakuditi, Melissa A. Dempsey, Soma Das