While building and operating Databricks Cloud, we have observed that, in practice, data scientists fall into three distinct categories, or personas. Although each persona works with data, they differ dramatically in the problems they solve, the tools they use, and the deliverables they produce. In this talk we will define and discuss these personas, provide a real-world example of each, and relate them to each other and to the decisions we’ve made while building Databricks Cloud.
Andy Konwinski is a Spark committer and co-founder of Databricks. He has been working on Spark since the early days of the project, contributing to the testing infrastructure, the performance evaluation framework, the website and other project infrastructure, and community evangelism. Andy led the creation of the UC Berkeley AMP Camps and the Spark Summits. He is also a co-author of the O’Reilly book Learning Spark and leads the Bay Area Spark meetup group.