Going real-time is the next phase for big data, and streaming remains a primary mechanism to get there. Spark provides groundbreaking capabilities for handling real-time data, including streams and transformations. Retaining both real-time and historical data provides the most accurate basis for predictive analytics and machine learning.
In this session, I will outline how to architect real-time data pipelines with the power of Apache Spark and a robust, distributed in-memory database.
In particular, I will detail how some of the world’s largest companies are running business-critical applications using Spark.
Attendees will dive deep into the mechanics of real-time pipelines, how to store data durably, and how to instantly derive insights from billions of data points.
Nikita Shamgunov co-founded MemSQL and has served as CTO since inception. Prior to co-founding the company, Nikita worked on core infrastructure systems at Facebook. He served as a senior database engineer at Microsoft SQL Server for more than half a decade. Nikita holds a bachelor’s, master’s and doctorate in computer science, has been awarded several patents and was a world medalist in ACM programming contests.