
 

Day 1 • Monday, June 6 • Training

7:00

Registration

9:00

Training: Apache Spark Essentials

Training: Data Science With Apache Spark

Training: Advanced: Exploring Wikipedia With Spark

12:00

Lunch

1:00 PM
6:30 PM

Meetup - TensorFlow on Apache Spark & Ask Me Anything

TensorFrames: TensorFlow on Spark DataFrames

We will be holding a meetup during Spark Summit. Agenda to follow! We will have food and beverages available (sponsored by SAP).

6:30-7pm: Mingling
7-7:15: Welcome. Opening remarks from SAP…

 

Day 2 • Tuesday, June 7 • Conference

7:00

Registration

9:00

Apache Spark 2.0

The next release of Apache Spark will be 2.0, marking a big milestone for the project. In this talk, I’ll cover how the community has grown to reach this point, and some of the major…
9:30
9:50
10:15
10:30

Break — Sponsored by Stratio Big Data

11:15
Data Science

Huohua: A Distributed Time Series Analysis Framework For Spark

Developer

Structuring Spark: Dataframes, Datasets And Streaming

Use Cases & Experience

Five Lessons Learned In Building Streaming Applications At Microsoft Bing Scale

Spark Ecosystem

Spark Uber Development Kit

Research

Low Latency Execution For Apache Spark

11:50
Data Science

Bolt: Building A Distributed ndarray

Developer

A Deep Dive Into Structured Streaming

Use Cases & Experience

Airstream: Spark Streaming At Airbnb

12:25
Data Science

Recent Developments In SparkR For Advanced Analytics

Use Cases & Experience

Building Realtime Data Pipelines with Kafka Connect and Spark Streaming

Spark Ecosystem

Elasticsearch And Apache Lucene For Apache Spark And MLlib

Research

Deploying Accelerators At Datacenter Scale Using Spark

12:55 PM

Lunch — Sponsored by Databricks

2:00 PM
Use Cases & Experience

Netflix - Productionizing Spark On Yarn For ETL At Petabyte Scale

Spark Ecosystem

Spark On Mesos: The State Of The Art

Research

Time-Evolving Graph Processing On Commodity Clusters

2:35 PM
Data Science

Huawei Advanced Data Science With Spark Streaming

Use Cases & Experience

Databricks' Data Pipelines: Journey And Lessons Learned

3:10 PM
Developer

Scalable Deep Learning Platform On Spark In Baidu

Use Cases & Experience

Scalable And Incremental Data Profiling With Spark

Spark Ecosystem

Spark And Cassandra: 2 Fast, 2 Furious

Research

A Graph-Based Method For Cross-Entity Threat Detection

3:45 PM

Break — Sponsored by Databricks

4:15 PM
Data Science

CaffeOnSpark: Deep Learning On Spark Cluster

Developer

High-Performance Python On Spark

Use Cases & Experience

Heterogeneous Workflows With Spark At Netflix

Spark Ecosystem

GPU Computing With Apache Spark And Python

Research

Massive Simulations In Spark: Distributed Monte Carlo For Global Health Forecasts

  • Kyle Foreman (University of Washington's Institute for Health Metrics and Evaluation)
4:50 PM
Data Science

Utilizing Human Data Validation For KPI Analysis And Machine Learning

Developer

Spark: Interactive To Production

Spark Ecosystem

Livy: A REST Web Service For Apache Spark

Research

Spatial Analysis On Histological Images Using Spark

5:25 PM
6:00 PM

Attendee Reception

 

Day 3 • Wednesday, June 8 • Conference

7:00

Registration

9:00
9:20
9:30
9:40
10:00

Pedal to the Metal: Accelerating Apache Spark with Innovations in Silicon Technology

Intel, a leading contributor to the Apache Spark project, is pioneering the creation of a highly optimized foundation built on the Intel architecture for large-scale distributed analytics with Spark. Learn how Intel’s forward-thinking innovations in…
10:10
10:20

Spark 360 Panel

Apache Spark’s capabilities as a fast, distributed computing framework extend across myriad industries, and Spark is deployed in a wide range of organizations. In this panel, we assemble notable representatives who can offer a 360-degree view of…
10:45

Break

11:15
Data Science

Production Readiness Testing At Salesforce Using Spark MLlib

Developer

Deep Dive: Apache Spark Memory Management

Spark Ecosystem

Connecting Python To The Spark Ecosystem

Research

Solving The N+1 Problem In Personalized Genomics

11:50
12:25
Data Science

Apache Spark MLlib 2.0 Preview: Data Science and Production

Developer

Enhancing Spark SQL Optimizer With Reliable Statistics

Spark Ecosystem

Women in Big Data

12:55 PM

Lunch

2:00 PM
Data Science

GraphFrames: Graph Queries In Spark SQL

Use Cases & Experience

Lessons Learned From Running Spark On Docker

Enterprise

Is Apache Spark the Future of Data Analysis?

Spark Ecosystem

Morticia: Visualizing And Debugging Complex Spark Workflows

2:35 PM
Data Science

Finding Graph Isomorphisms In GraphX And GraphFrames

Use Cases & Experience

Top 5 Mistakes When Writing Spark Applications

Spark Ecosystem

Vertica And Spark: Connecting Computation And Data

3:10 PM
3:45 PM

Break

4:15 PM
Enterprise

The Internet of Everywhere—How IBM The Weather Company Scales

Spark Ecosystem

Reactive Streams, Linking Reactive Application To Spark Streaming

Research

Understanding Memory Management In Spark For Fun And Profit

4:50 PM
5:25 PM
Data Science

Locality Sensitive Hashing By Spark

Spark Ecosystem

Solr As A SparkSQL DataSource

Research

Handling Data Skew Adaptively In Spark Using Dynamic Repartitioning