SESSION

Building Competing Models Using Apache Spark DataFrames


Credit Karma serves over 60 million members with vastly different credit profiles. Therefore, it is critically important that they recommend financial products that are tailored to each unique member, and that they do so efficiently at scale. To solve this recommendation problem, they build predictive models that compete against each other both vertically (in different stages of the conversion pipeline) and horizontally (for showing different offers) to provide the final recommendation.

In this session, Abdulla Al-Qawasmeh will demonstrate how Credit Karma builds predictive recommendation models using Spark DataFrames, and how they provide type safety while using DataFrames. See how Credit Karma’s choice of metric evaluation helps them calibrate the models to obtain the best global result, and hear about lessons learned when they scaled their model development environment to handle Terabyte-scale data with thousands of features.

Participants will walk away with two takeaways: an understanding of how model calibration plays a crucial role in the performance of a recommendation system when competing models are involved, and an example of how to scale Spark.ml to work with Terabyte-scale data.

Session hashtag: #SFml11

Abdulla Al-Qawasmeh, Engineering Manager at Credit Karma

About Abdulla

Abdulla Al-Qawasmeh is an Engineering Manager at Credit Karma focusing on Machine Learning. His team builds the machine learning infrastructure used across Credit Karma and deals with the challenges of that infrastructure's scalability and composability, automation, and the quality of the data used to build models. Abdulla holds a Ph.D. degree in computer engineering and M.S. and B.S. degrees in computer science.