At Databricks, we have a unique view into over a hundred different companies trying out Spark for development and production use-cases, from their support tickets and forum posts. Having seen so many different workflows and applications, some discernible patterns emerge when looking at common performance and scalability issues that our users run into. This talk will discuss some of these common common issues from an engineering and operations perspective, describing solutions and clarifying misconceptions.
Aaron Davidson is an Apache Spark committer and software engineer at Databricks. His Spark contributions include standalone master fault tolerance, shuffle file consolidation, Netty-based block transfer service, and the external shuffle service. At Databricks, he leads the Performance and Storage team, working on the Databricks File System (DBFS) and automating the cloud infrastructure,