Spark Summit 2017 Keynote Speakers

Matei Zaharia

Co-founder and Chief Technologist, Databricks

Eric Siegel

Founder and Author, Predictive Analytics World

Ali Ghodsi

CEO & Co-Founder, Databricks

Christopher Ré

Associate Professor, Stanford

Michael Greene

Vice President, Software and Service Group, Intel

Spark Summit 2017 by the Numbers

6 Training Courses
179 Sessions
11 Tracks: Data Science, Data Science 2, Developer, Enterprise, Machine Learning, Research, Spark Ecosystem, Use Cases, Sponsored Sessions, Streaming, Technical Deep Dives

JOIN Party - June 7 - Mezzanine

Don’t miss the closing party!


Wednesday, June 7

444 Jessie St

Featuring Rob Garza of Thievery Corporation

Live Stream Registration

Register to watch the Spark Summit keynotes for FREE via live web streaming.

The Spark Summit live stream will be active from 9:00 to 10:30 AM Pacific Time on both Tuesday, June 6, and Wednesday, June 7, 2017.

Register now

The World’s Largest Event for the Apache® Spark™ Community

Join more than 3,000 developers, engineers, data scientists, researchers and business professionals for three days of in-depth learning and networking.

With over 170 sessions and 11 tracks to choose from, including Developer, Data Science, Enterprise, Machine Learning and Streaming, there’s content for every level and role. You can also add on 1-Day or 2-Day training courses.

Learn about:

  • Structured Spark, Spark Streaming and related projects
  • The future of Apache Spark
  • How to use the Spark stack in a variety of applications
  • Best practices for deploying Spark at scale

Apache® Spark™ is a powerful open source processing engine built around speed, ease of use, and sophisticated analytics. It was started at UC Berkeley in 2009 and is now developed at the vendor-independent Apache Software Foundation. Since its release, Spark has seen rapid adoption by enterprises across a wide range of industries. Internet powerhouses such as Yahoo, eBay and Netflix have deployed Spark at massive scale, processing multiple petabytes of data on clusters of over 8,000 nodes. Apache Spark has also become the largest open source community in big data, with over 1,000 contributors from 250+ organizations.

Conference Chairs

Reynold Xin


Reynold oversees technical contributions to Apache® Spark™ at Databricks, initiating efforts such as DataFrames and Project Tungsten. To demonstrate Spark’s scalability and performance, he led the effort in the 2014 Daytona GraySort contest and set the 2014 world record, beating the previous record held by Hadoop with 30X higher per-node efficiency. Prior to Databricks, he was a PhD student at the UC Berkeley AMPLab, where he focused on scalable data processing. He wrote the most-cited papers in SIGMOD 2011, 2013, and 2015, and won the Best Demo Award at VLDB 2011 and SIGMOD 2012.

Edd Wilder-James

Silicon Valley Data Science

Founding chair of the pioneering data conference, O'Reilly Strata, Edd is a respected voice in the worlds of data, open source and the web. His work in emerging technology also includes six years as program chair of OSCON, and acting as the founding editor of the peer-reviewed journal "Big Data". He is currently VP of Technology Strategy at Silicon Valley Data Science.

Training Courses

Spark Summit 2017 features a number of 1-day and 2-day training workshops that include a mix of instruction and hands-on exercises to help you improve your Apache Spark skills. Training is offered as a standalone ticket; if you wish to attend any conference sessions on June 6 or 7, you must register for a conference pass as well.

Training courses and dates for Spark Summit 2017:

Exploring Wikipedia with Apache Spark
Just Enough Scala for Spark
SOLD OUT – Architecting a Data Platform
SOLD OUT – Data Science with Apache Spark 2.x
SOLD OUT – Apache Spark Intro for Data Engineering
SOLD OUT – Apache Spark Intro for Machine Learning and Data Science

Spark Summit Conference

Spark Summit 2017 has something for everyone, from developers and data scientists to researchers and business executives.

Find out what’s in store for:

Developer Day
June 6

Who It’s For:

  • Apache Spark Developers
  • Data Scientists
  • Infrastructure or Site Reliability Engineers
  • Researchers

Why You Should Attend:

  • Find out what lies ahead for the open source Spark project.
  • Hear how to improve performance and memory optimization from Spark committers.
  • Learn how to leverage Structured Streaming, and discover machine learning at scale.
  • Get tips and tools to process big data more quickly and efficiently from leading data scientists and researchers.

Enterprise Day
June 7

Who It’s For:

  • Data Practitioners
  • Key Decision Makers
  • Business Executives

Why You Should Attend:

  • See how leading companies successfully deploy Apache Spark at scale.
  • Get proven best practices to improve your Spark usage.
  • Learn how Spark is employed in a variety of enterprise applications.
  • Hear how other enterprise Spark users solve business problems.


Moscone West Convention Center

Moscone West Convention Center
800 Howard Street
Corner of 4th & Howard
San Francisco, CA 94103

(415) 974-4000

Conveniently located in the South of Market area, Moscone West provides easy access to downtown San Francisco’s many hotels and restaurants, giving you the opportunity to see what the city has to offer after the sessions close. Take advantage of easy transportation via BART, MUNI and Caltrain.

Looking for a new gig? Check out the Job Board

Spark Summit 2017 Sponsors

If you have media questions, or would like to find out about sponsoring a Spark Summit, please contact