The ParaHaplo project develops parallel computing tools for genome-wide association studies.
Background: Since more than a million single-nucleotide polymorphisms (SNPs) are analyzed in
any given genome-wide association study (GWAS), performing multiple comparisons can be
problematic. To cope with multiple-comparison problems in GWAS, haplotype-based algorithms
were developed to correct for multiple comparisons at multiple SNP loci in linkage disequilibrium.
A permutation test can also control problems inherent in multiple testing; however, both the
calculation of exact probability and the execution of permutation tests are time-consuming. Faster
methods for calculating exact probabilities and executing permutation tests are required.
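
To illustrate why permutation tests are costly, the following is a minimal sketch of a max-statistic permutation test across several SNPs. It is not ParaHaplo's haplotype-based algorithm: the statistic (absolute case-control difference in mean allele count), the function names, and the 0/1/2 genotype coding are assumptions chosen for brevity. Each permutation recomputes the statistic at every SNP, so the cost grows with the number of permutations times the number of SNPs.

```python
import random


def allele_diff(genotypes, labels):
    """Absolute case-control difference in mean allele count for one SNP.

    Hypothetical test statistic used only for illustration.
    """
    cases = [g for g, lab in zip(genotypes, labels) if lab == 1]
    controls = [g for g, lab in zip(genotypes, labels) if lab == 0]
    return abs(sum(cases) / len(cases) - sum(controls) / len(controls))


def max_stat_adjusted_pvalues(snps, labels, n_perm=1000, seed=0):
    """Return one multiplicity-adjusted P-value per SNP.

    snps   : list of per-SNP genotype lists (allele counts 0, 1 or 2)
    labels : list of 0 (control) / 1 (case), one entry per individual
    """
    rng = random.Random(seed)
    observed = [allele_diff(snp, labels) for snp in snps]
    exceed = [0] * len(snps)
    shuffled = list(labels)
    for _ in range(n_perm):
        # Shuffling labels breaks any genotype-phenotype association
        # while keeping the correlation (LD) between SNPs intact.
        rng.shuffle(shuffled)
        max_stat = max(allele_diff(snp, shuffled) for snp in snps)
        for i, obs in enumerate(observed):
            if max_stat >= obs:
                exceed[i] += 1
    # Add-one correction keeps the estimate strictly above zero.
    return [(count + 1) / (n_perm + 1) for count in exceed]
```

Comparing each observed statistic with the permutation distribution of the maximum over all SNPs is what controls the family-wise error rate without assuming the SNPs are independent.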
Methods: We developed a set of computer programs for the parallel computation of accurate P-values
in haplotype-based GWAS. Our program, ParaHaplo, is intended for workstation clusters
using the Intel Message Passing Interface (MPI). We compared the performance of our algorithm
to that of the regular permutation test on the JPT (Japanese in Tokyo) and CHB (Han Chinese in Beijing) populations of HapMap.
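
Permutations are independent of one another, which is what makes the test easy to parallelize. The sketch below shows the general idea of splitting the permutations across workers and summing the results. It uses Python's standard `concurrent.futures` on a single machine rather than MPI on a cluster, and every name in it (`split_work`, `count_exceedances`, `parallel_permutation_pvalue`) is hypothetical rather than part of ParaHaplo.

```python
import random
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def split_work(n_perm, n_workers):
    """Split n_perm permutations into n_workers near-equal chunk sizes."""
    base, extra = divmod(n_perm, n_workers)
    return [base + (1 if i < extra else 0) for i in range(n_workers)]


def mean_diff(values, labels):
    """Absolute difference in group means (illustrative statistic)."""
    cases = [v for v, lab in zip(values, labels) if lab == 1]
    controls = [v for v, lab in zip(values, labels) if lab == 0]
    return abs(sum(cases) / len(cases) - sum(controls) / len(controls))


def count_exceedances(task):
    """Run one worker's share of permutations and count the extreme ones."""
    values, labels, observed, n_perm, seed = task
    rng = random.Random(seed)  # each worker gets its own seed
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if mean_diff(values, shuffled) >= observed:
            hits += 1
    return hits


def parallel_permutation_pvalue(values, labels, n_perm, n_workers,
                                use_processes=True):
    """Estimate a permutation P-value with the work split across workers.

    Set use_processes=False to use threads instead, which is convenient
    for quick checks but gives no real speed-up for pure-Python code.
    """
    observed = mean_diff(values, labels)
    tasks = [(values, labels, observed, size, seed)
             for seed, size in enumerate(split_work(n_perm, n_workers))
             if size > 0]
    pool_cls = ProcessPoolExecutor if use_processes else ThreadPoolExecutor
    with pool_cls(max_workers=n_workers) as pool:
        hits = sum(pool.map(count_exceedances, tasks))
    return (hits + 1) / (n_perm + 1)
```

Only the per-worker exceedance counts have to be communicated back, so the communication cost stays small relative to the computation. Seeding workers with consecutive integers is a simplification; a production implementation would want properly independent random streams.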
Results: ParaHaplo can detect smaller differences between two populations than SNP-based GWAS.
We also found that parallel-computing techniques made ParaHaplo 100-fold faster than a nonparallel
version of the program.
Conclusion: ParaHaplo is a useful tool in conducting haplotype-based GWAS. Since the data sizes
of such projects continue to increase, the use of fast computations with parallel computing, such
as that used in ParaHaplo, will become increasingly important.
Articles describing ParaHaplo are published in Source Code for Biology and Medicine:
ParaHaplo 1.0: http://www.scfbm.org/content/4/1/7.
ParaHaplo 2.0: http://www.scfbm.org/content/5/1/5.
ParaHaplo 3.0: http://www.scfbm.org/content/6/1/10.