Ashish Singh

Senior Software Engineer, PubMatic Inc.

Ashish is currently working as a Senior Software Engineer on the Data Team at PubMatic. Most recently he has been upgrading audience analytics to the Spark platform for more efficient reporting. Ashish is interested in building large-scale aggregation and analytics platforms and loves experimenting with probabilistic data structures to solve complex compute problems.


Migrating Complex Data Aggregation from Hadoop to Spark

This talk discusses our experience migrating from Hadoop MapReduce (MR) to Spark. Our initial implementation used a multi-stage aggregation framework within Hadoop MR to join, de-dupe, and group 12 TB of incoming data every 3…