Speaker

Alexey Svyatkovskiy, Big Data Scientist at Princeton Institute for Computational Science and Engineering

Alexey Svyatkovskiy is a Big Data scientist at Princeton Institute for Computational Science and Engineering (PICSciE). He works on several projects, including large-scale text processing and NLP with Spark ML applied to modern American politics; development of the Histogrammar package, a cross-platform suite of data aggregation primitives for making histograms, calculating descriptive statistics, and plotting in Scala; and applications of scalable deep learning methods in plasma physics.

Outside of research, Alexey organizes workshops on Apache Spark at Princeton University and contributes to the local data science meetup group and the PrincetonPy community.

Alexey holds a PhD in particle physics. His thesis research focused on large-scale statistical analysis and charged particle reconstruction with the CMS experiment at the CERN LHC, and his work has been published in the leading journals in the field.

Sessions

Large-Scale Text Processing Pipeline with Spark ML and GraphFrames

In this talk we evaluate Apache Spark for a data-intensive machine learning problem. Our use case focuses on policy diffusion detection across the state legislatures in the United States over time. Previous work on policy…