Shivaram Venkataraman

PhD Student, UC Berkeley AMPLab

Shivaram Venkataraman is a fourth-year PhD student at the University of California, Berkeley, where he works with Mike Franklin and Ion Stoica at the AMPLab. He is a committer on the Apache Spark project, and his research interests are in designing systems for large-scale machine learning. Before coming to Berkeley, he completed his M.S. at the University of Illinois, Urbana-Champaign and worked as a Software Engineer at Google.


A Data Frame Abstraction Layer for SparkR

The data frame is a fundamental construct in R programming and one of the primary reasons R has become such a popular language for data analysis. In Spark 1.3, Spark SQL received its own…

Building Large Scale Machine Learning Applications with Pipelines

Real-world machine learning applications typically consist of many components in a data processing pipeline. For example, in text classification, preprocessing steps like n-gram extraction and TF-IDF feature weighting are often necessary before training of…

SparkR: The Past, the Present and the Future

The SparkR project provides language bindings and runtime support to enable users to run scalable computation from R using Apache Spark. SparkR has an active set of contributors from many companies and a number of…