San Francisco
June 30 - July 2, 2014

Spark Summit 2014 brought the Apache Spark community together on June 30 - July 2, 2014 at the Westin St. Francis in San Francisco. It featured production users of Spark, Shark, Spark Streaming, and related projects.


Spark Summit 2014
Spark on YARN: a Deep Dive
Sandy Ryza (Cloudera)

Spark’s YARN support allows Spark workloads to be scheduled on Hadoop alongside a variety of other data-processing frameworks. This talk will be a deep dive into the architecture and uses of Spark on YARN. We’ll cover how Spark’s resource management model intersects with YARN’s, then look at the supported deploy modes and operational best practices. Finally, we’ll discuss roadmap items such as executor container resizing and integration with YARN’s application history store.

Sandy Ryza is an engineer on the data science team at Cloudera. He is a committer on Apache Hadoop and a contributor to Apache Spark.

Slides PDF | Video