SESSION

Building Genomic Data Processing and Machine Learning Workflows Using Apache Spark


Epinomics uses epigenomic data analysis to advance epigenetic research and drive personalized medicine. Their goal is to provide the community with an analysis resource that promotes high-quality data and replicable, interpretable results. They work with academic and commercial users to ingest and analyze genomic sequencing data and metadata. From the sequenced genome they extract epigenetic features called “chromatin accessibility”, which are indicative of the epigenetic changes instrumental in differential gene expression and disease development.

Epinomics has built an Apache Spark-based pipeline that retrieves chromatin accessibility data from the epigenome, uses GraphX to find overlapping accessibility atlas regions, and then clusters the data and runs machine learning algorithms. This session will provide a primer on epigenomics, details about Epinomics’ Spark-based data pipeline with a focus on parallel bioinformatic analysis, and an account of how they use machine learning models to build the epigenomic landscape and accelerate the field of personalized immunotherapy.

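The abstract does not spell out how overlapping accessibility regions are found, so the following is only a rough, single-machine sketch of the general idea, not Epinomics' GraphX implementation. It assumes accessibility regions are given as hypothetical `(chromosome, start, end)` tuples with half-open coordinates, and it merges every group of overlapping regions into one consensus region. For intervals on a line, this produces the same groups as the connected components of a graph whose edges join overlapping regions. The function name and sample data are illustrative assumptions.

```python
from collections import defaultdict


def merge_overlapping_regions(regions):
    """Merge overlapping (chrom, start, end) intervals into consensus regions.

    Intervals are treated as half-open [start, end), so two regions overlap
    only when they share at least one base on the same chromosome.
    """
    # Group intervals by chromosome; overlaps never cross chromosomes.
    by_chrom = defaultdict(list)
    for chrom, start, end in regions:
        by_chrom[chrom].append((start, end))

    merged = []
    for chrom in sorted(by_chrom):
        current_start, current_end = None, None
        # Sorting by start lets a single sweep detect every overlap.
        for start, end in sorted(by_chrom[chrom]):
            if current_start is None:
                current_start, current_end = start, end
            elif start < current_end:
                # Overlaps the running region: extend it.
                current_end = max(current_end, end)
            else:
                # Gap found: close the running region and start a new one.
                merged.append((chrom, current_start, current_end))
                current_start, current_end = start, end
        merged.append((chrom, current_start, current_end))
    return merged


# Hypothetical regions pooled from several samples (illustrative data only).
sample_regions = [
    ("chr1", 100, 200),
    ("chr1", 150, 300),
    ("chr1", 400, 500),
    ("chr2", 10, 50),
    ("chr2", 50, 80),
]
atlas = merge_overlapping_regions(sample_regions)
```

Here the two overlapping `chr1` regions collapse into one, while the adjacent `chr2` regions stay separate because half-open intervals that only touch share no base. In a distributed setting the same grouping could be expressed as a connected-components computation over an overlap graph, which is one plausible reason to reach for a graph library.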

Session hashtag: #SFr11

Anupama Joshi, Technical Leader at Epinomics

About Anupama

At Epinomics, I am responsible for overall technical leadership of product development and delivery. I manage the engineering team and the roadmap for Epinomics' tech infrastructure and consumer-facing product. I work on the design and architecture of the analytics infrastructure that processes large amounts of NGS data, and I work with a team of scientists to develop machine learning algorithms that find actionable insights in genetic data.

Matt Negulescu, Senior Product Engineer at Epinomics

About Matt

At Epinomics, Matt is working on a new line of scientific management and epigenomic analytics products.