SHORT REPORT

SNVerGUI: a desktop tool for variant analysis of next-generation sequencing data

Wei Wang,1 Weicheng Hu,1 Fang Hou,1 Pingzhao Hu,2 Zhi Wei1

Additional supplementary files are published online only. To view these files please visit the journal online (http://dx.doi.org/10.1136/jmedgenet-2012-101001).

1 Department of Computer Science, New Jersey Institute of Technology, Newark, New Jersey, USA
2 The Centre for Applied Genomics, The Hospital for Sick Children, Toronto, Ontario, Canada

Correspondence to Dr Zhi Wei, Department of Computer Science, New Jersey Institute of Technology, GITC 4400, Newark, NJ 07102, USA; zhiwei@njit.edu

Received 24 April 2012
Revised 21 August 2012
Accepted 22 August 2012
Published Online First 28 September 2012

ABSTRACT
Background Advances in next generation sequencing (NGS) technology have made it possible to comprehensively interrogate genome-wide genetic variation. However, most existing tools for variant detection are based on command-line interfaces, which discourages the main end users of NGS data, such as biologists, geneticists and clinicians, from utilising the software.
Method and Results We have developed SNVerGUI, a graphical user interface (GUI) based tool for variant detection and analysis. Compared with other methods for variant calling, our approach is unique in that it is applicable to both individual and pooled sequencing data. With its friendly GUI, end users can easily adjust running parameters to optimise variant calling for their specific needs. SNVerGUI supports commonly used input and output file formats, which allows it to be seamlessly integrated into common NGS data analysis pipelines. SNVerGUI is implemented in Java, which is platform-independent, and is therefore easy to install and run on commonly used operating systems such as Linux, Mac, and Windows.
Using two real datasets, we have shown that SNVerGUI is capable of analysing very high volume NGS data in a feasible time on personal computers.
Conclusions SNVerGUI is a fast and easy-to-use desktop GUI tool for identifying genomic variants from pooled and individual sequencing data. With this software, users can perform sophisticated variant detection simply by configuring several parameters in a friendly graphical user interface. SNVerGUI makes variant analysis as simple and effortless as possible, and we expect it to become popular among geneticists, clinicians, and biologists. SNVerGUI can be freely downloaded from http://snver.sourceforge.net/snvergui/, and will be continuously updated in response to users' feedback.

Since the human genome was completely sequenced, the cost of DNA sequencing has plummeted 100 000-fold over the past decade, allowing a wide spectrum of applications that use next generation sequencing (NGS) data.1 In medical genetics, analysis of NGS data has led to the identification of causal genetic factors underlying numerous Mendelian disorders.2 3 It has become increasingly cost effective to use NGS in the clinical diagnosis of Mendelian diseases with known genetic mutations. Identification of genetic variants from NGS data is therefore particularly important.

Most existing NGS variant calling tools4 5 provide only command-line interfaces. Typically, users must execute these tools, and sometimes apply additional filters, from the command line. This may discourage biologists, geneticists, clinicians, and other end users, who often lack the programming expertise needed to apply tools without a graphical user interface (GUI). Several GUI based variant calling tools have been developed to address this concern.6–8 However, they are not adapted to detecting variants from pooled sequencing data, which account for a sizable proportion of current NGS studies.
9–11 This motivated us to implement SNVerGUI, a GUI based desktop environment, in order to exploit the unique merits of our recently developed statistical tool SNVer12 for detecting SNVs and indels from both pooled and individual NGS data. The pipeline of SNVerGUI is illustrated in figure 1. Its new and key merits are highlighted as follows.
1. Compared with its previous command-line version,12 SNVerGUI adds three new features. First, it can estimate locus-specific sequencing error from the data (see supplementary method), so users do not need to specify this critical parameter. Second, SNVerGUI can call indel variants (see supplementary method). Third, variant outputs in variant call format (VCF) can be directed to the user friendly web version of the popular annotation tool wANNOVAR13 for delineating their functional consequences.
2. SNVerGUI is applicable to both individual and pooled NGS data by using a unified binomial-binomial statistical model. It can handle single pool NGS data, which cannot be processed by most, if not all, existing state-of-the-art tools. To our knowledge, SNVerGUI is the first GUI tool for calling variants from pooled sequencing data.
3. SNVerGUI supports widely used input and output file formats. SNVerGUI accepts aligned read data and reference sequence data in popular file formats, such as FASTA (.fasta), Sequence Alignment/Map (SAM) and the binary version of SAM (BAM) files. Variant detection results are output in the comma-separated values (CSV) format, which can be opened directly in Excel. They are also output in the standard VCF,14 which is accepted as input by other powerful tools, for example, VarSifter15 for filtering and ANNOVAR16 for annotation.
4. SNVerGUI provides flexible interactive post-call processing. Analysis results are displayed in easy-to-analyse table views that support sorting variants by p value, sequencing depth, allele frequency, etc. Users can easily customise the
J Med Genet 2012;49:753–755.
doi:10.1136/jmedgenet-2012-101001
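The post-call processing described in merit 4 (sorting variants by p value, sequencing depth or allele frequency) can also be approximated outside the GUI on a CSV export. The following Python sketch is illustrative only: the column names (`chrom`, `pos`, `depth`, `allele_freq`, `p_value`), the sample rows, and the thresholds are assumptions made for this example and are not SNVerGUI's documented output schema or defaults.

```python
import csv
import io

# Made-up example rows; the header names are hypothetical, not SNVerGUI's schema.
SAMPLE_CSV = """chrom,pos,ref,alt,depth,allele_freq,p_value
chr1,10177,A,C,58,0.12,3.0e-04
chr2,20455,G,T,210,0.48,1.0e-12
chr7,55012,T,TA,35,0.06,4.0e-02
chrX,90331,C,G,120,0.25,2.5e-07
"""


def load_variants(text):
    """Parse CSV text into a list of dicts, converting numeric columns."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        row["pos"] = int(row["pos"])
        row["depth"] = int(row["depth"])
        row["allele_freq"] = float(row["allele_freq"])
        row["p_value"] = float(row["p_value"])
        rows.append(row)
    return rows


def filter_and_sort(variants, max_p=0.01, min_depth=50):
    """Keep calls passing both (hypothetical) thresholds, most significant first."""
    kept = [v for v in variants
            if v["p_value"] <= max_p and v["depth"] >= min_depth]
    return sorted(kept, key=lambda v: v["p_value"])


variants = load_variants(SAMPLE_CSV)
top = filter_and_sort(variants)
for v in top:
    print(v["chrom"], v["pos"], v["ref"], v["alt"],
          v["depth"], v["allele_freq"], v["p_value"])
```

Changing the `key` function to `lambda v: -v["depth"]` or `lambda v: v["allele_freq"]` would reorder the same table by depth or allele frequency, which is the kind of re-sorting the table view offers interactively.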