Introduction
Sequencing by hybridization to oligonucleotides has evolved into an inexpensive, reliable and fast technology for targeted sequencing. Hundreds of human genes can now be sequenced within a day using a single hybridization to a resequencing microarray. However, several issues inherent to these arrays (e.g., cross-hybridization and variable probe/target affinity) cause sequencing errors and have prevented more widespread application. We developed an R package for resequencing microarray data analysis that integrates a novel statistical algorithm, sequence robust multi-array analysis (SRMA), for rare variant detection with high sensitivity (false negative rate, FNR, 5%) and accuracy (false positive rate, FPR, < 1 x 10^-5). The SRMA package consists of five modules: quality control, data normalization, single-array analysis, multi-array analysis and output analysis. The entire workflow is efficient and identifies rare DNA single nucleotide variations (SNVs) and structural changes such as gene deletions with high accuracy and sensitivity.
References
Wang W, Shen P, Thyagarajan S, Lin S, Palm C, Horvath R, Klopstock T, Cutler D, Pique L, Schrijver I, Davis RW, Mindrinos M, Speed TP, Scharfe C. Identification of Rare DNA Variants in Mitochondrial Disorders with Improved Array-based Resequencing. Nucleic Acids Research 2010 Sep 15; doi:10.1093/nar/gkq750.
Zhang N, Xu Y, O'Hely M, Speed TP, Scharfe C, Wang W. SRMA: an R package for resequencing array data analysis. Bioinformatics 2012; doi:10.1093/bioinformatics/bts286.
Version
Version 1.0.0: SRMA_1.0.0-2.tar.gz. SRMA is also available on CRAN.
Installation in R environment (version 2.15.0):
>source("http://bioconductor.org/biocLite.R")
>biocLite(c("aroma.light","affxparser"))
>install.packages("SRMA")
Notes: Depending on which R libraries are already installed on your computer, the actual installation procedure may differ. If an error reports that a certain library is missing, try either biocLite("library name") or install.packages("library name").
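The check described in the note above can be done before loading SRMA. The following is a minimal sketch, not part of the SRMA package: the helper name missing_packages is hypothetical, and the list of required packages is assumed to be just the three named in the installation commands. It only reports what is missing and does not install anything.

```r
# Hypothetical helper (not part of SRMA): return the packages in `required`
# that do not appear in `installed`. By default `installed` is the set of
# package names currently visible to this R session.
missing_packages <- function(required,
                             installed = rownames(installed.packages())) {
  setdiff(required, installed)
}

# Assumption: these are the packages named in the installation steps above.
required <- c("aroma.light", "affxparser", "SRMA")

missing <- missing_packages(required)
if (length(missing) > 0) {
  message("Missing packages: ", paste(missing, collapse = ", "))
} else {
  message("All required packages are installed.")
}
```

Any package reported as missing can then be installed with biocLite() or install.packages(), as described in the note.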