Speaker

Tejas Patil, Software Engineer at Facebook

Tejas Patil

Software Engineer, Facebook

Tejas is a software engineer at Facebook. For the past 3 years, he has been part of the Data Infrastructure group at Facebook and primarily works on building large scale distributed data processing systems responsible for handling batch workloads. He is currently a PMC member and committer of Apache Nutch and has contributed to several open source projects. Tejas obtained a Master’s Degree in Computer Science from University Of California, Irvine.

Sessions

Hive Bucketing in Apache Spark

Bucketing is a partitioning technique that can improve performance in certain data transformations by avoiding data shuffling and sorting. The general idea of bucketing is to partition, and optionally sort, the data based on a… Read more