Speaker

Xiao Li, Software Engineer at Databricks

Xiao Li is a software engineer and Apache Spark Committer at Databricks. His main interests are in Spark SQL, data replication, and data integration. Previously, he was an IBM Master Inventor and an expert on asynchronous database replication and consistency verification. He received his Ph.D. from the University of Florida in 2011. He has over eight papers and eleven patents/applications in the field of data management.

Sessions

Building Robust ETL Pipelines with Apache Spark

Stable and robust ETL pipelines are a critical component of the data infrastructure of modern enterprises. ETL pipelines ingest data from a variety of sources and must handle incorrect, incomplete or inconsistent records and produce…