Photo of

Joseph Bradley

Software Engineer, Databricks

Joseph Bradley is a Spark Committer working on MLlib at Databricks. Previously, he was a postdoc at UC Berkeley after receiving his Ph.D. in Machine Learning from Carnegie Mellon U. in 2013. His research included probabilistic graphical models, parallel sparse regression, and aggregation mechanisms for peer grading in MOOCs.


Combining the Strengths of MLlib, scikit-learn, and R

This talk discusses integrating common data science tools like Python pandas, scikit-learn, and R with MLlib, Spark’s distributed Machine Learning (ML) library. Integration is simple; migration to distributed ML can be done lazily; and scaling…