Spark Summit 2014 brought the Apache Spark community together from June 30 to July 2, 2014, at The Westin St. Francis in San Francisco. It featured production users of Spark, Shark, Spark Streaming, and related projects.
Over the past several years, Spark has significantly changed the landscape of big data computing and dramatically improved application performance. However, several challenges remain, such as high garbage collection (GC) overhead. In this talk, I will introduce Tachyon, a distributed in-memory storage system. I will also discuss how Tachyon can further improve Spark’s performance and how the two systems integrate.
Haoyuan Li is a Computer Science PhD student in the AMP Lab at UC Berkeley, working with Ion Stoica and Scott Shenker on computer systems and cloud computing. He is the lead developer of the Tachyon distributed file system. Before Berkeley, he studied at Cornell University and Peking University, and worked at Conviva and Google.