San Francisco
June 30 - July 2, 2014

Spark Summit 2013 brought the Apache Spark community together on December 2-3, 2013 at the Hotel Nikko in San Francisco. It featured production users of Spark, Shark, Spark Streaming and related projects.


Spark Summit 2013
SIMR: Let your Spark Jobs Simmer Inside Hadoop Clusters
Ahir Reddy, Databricks

SIMR (Spark Inside MapReduce) allows Spark jobs to run on any Hadoop MapReduce cluster. This eases deployment because the cluster does not need Spark or Scala installed beforehand: SIMR is simply a Hadoop MapReduce job that ships with a JAR containing both Spark and Scala. SIMR can launch an arbitrary Spark job inside the mappers of its MapReduce job while redirecting input and output to your terminal. It also has a shell mode that runs the Spark shell on the MapReduce cluster, giving you interactive access to Spark on existing MapReduce clusters.

Slides PDF | Video