Speaker

Li Jin, Distributed System Engineer at Two Sigma Investments, LP

Li Jin is a distributed system engineer at Two Sigma. Li focuses on building high-performance data analysis tools with Spark and is a co-creator of Flint, a time series analysis library on Spark. Previously, Li worked on building a large-scale task scheduling system. In his spare time, Li loves hiking, traveling, and winter sports.

Sessions

Improving Python and Spark Performance and Interoperability with Apache Arrow

Apache Spark has become a popular and successful way for Python programmers to parallelize and scale up data processing. In many use cases, though, a PySpark job can perform worse than an equivalent job written…