Effective programmers work in tight loops: make a small code edit, observe its effect on the system, and repeat. When your data is too big to read and your system isn't local, println() won't cut it. Fortunately, the Spark DataFrame and Dataset APIs have your back. Attendees will leave with practical tools for exploring large datasets and debugging distributed code in Spark, and a sharper mental model of distributed programming at scale.
Session hashtag: #SFdev19
Sam Penrose loves how working with data at scale for Mozilla brings out the power and beauty of mathematics. Previously he helped Industrial Light and Magic bring the power and beauty of giant robots out to movie screens everywhere.