More Algorithms and Tools for Genomic Analysis on Apache Spark


Hammer Lab has built and maintains Pageant, a parallel genomic analysis toolkit that contains tools for analyzing genomic data on Spark as well as libraries for more general computations using RDDs.

Ryan will discuss some of the most interesting applications and algorithms therein:

• coverage-depth: joint histograms of coverage depth for one or two genomic-read datasets

• guacamole: a work-in-progress somatic variant caller

• suffix-arrays: proof-of-concept implementations of distributed construction of suffix arrays and FM-indices
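
For readers unfamiliar with the term, here is a rough single-machine illustration of what a coverage-depth histogram computes: for each depth, the number of genomic positions covered by exactly that many reads. This is a minimal sketch in plain Python, not Pageant's implementation. The function name, the half-open (start, end) read representation, and the sample reads are all assumptions made for illustration.

```python
from collections import Counter


def coverage_depth_histogram(reads, contig_length):
    """Map each depth to the number of positions covered at that depth.

    `reads` is a list of half-open (start, end) intervals on a single contig,
    with 0 <= start <= end <= contig_length (a hypothetical representation).
    """
    # Difference array: +1 where a read starts, -1 just past where it ends.
    deltas = [0] * (contig_length + 1)
    for start, end in reads:
        deltas[start] += 1
        deltas[end] -= 1

    # A running sum of the deltas gives the depth at each position.
    histogram = Counter()
    depth = 0
    for position in range(contig_length):
        depth += deltas[position]
        histogram[depth] += 1
    return dict(histogram)


# Three overlapping reads on a contig of length 8.
reads = [(0, 4), (2, 6), (3, 5)]
hist = coverage_depth_histogram(reads, contig_length=8)
```

A distributed version would need to partition positions across workers and merge per-partition counts, and a joint histogram over two datasets would key on a pair of depths rather than a single depth. Neither is shown here.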

Session hashtag: #SFeco6

Ryan Williams, Software Developer at Mount Sinai School of Medicine

About Ryan

Ryan writes tools for analyzing genomic data using Spark at Hammer Lab.