Speaker

Vida Ha, Solutions Engineer at Databricks

Vida is a Solutions Engineer at Databricks, where she onboards and supports customers using Spark on Databricks Cloud. Previously, she worked on scaling Square’s Reporting Analytics System. She first began working with distributed computing at Google, where she improved search rankings of mobile-specific web content and built and tuned language models for speech recognition using a year’s worth of Google search queries. She’s passionate about accelerating the adoption of Apache Spark to bring the combination of speed and scale in data processing to the mainstream.

Sessions

Optimizing Apache Spark SQL Joins

Join operations in Apache Spark are often the biggest source of performance problems and even full-blown exceptions. After this talk, you will understand the two most basic methods Spark employs for joining dataframes…