San Francisco
June 30 - July 2, 2014


Spark Summit 2015e
Spark Plugs Into Your Car
Arpan Ghosh (Automatic), Rob Ferguson (Automatic)

Automatic is the most widely used connected-car device in the emerging Internet of Things. Our connected-car adapter exposes a large amount of data previously hidden within the car’s computer: hundreds of measurements per minute of driving, ranging from velocity and location to mass air flow and intake air temperature. We use the Spark ecosystem to sanitize and process this data, generate insights for individual drivers, and answer broader questions around transportation planning.

Drivers’ understanding of a car’s efficiency typically ends with the EPA rating in the car’s window at purchase. Using Spark, we compare each car’s data with data from other cars of the same class, make, model, etc. We run batch jobs in Spark to train a unique physical model of each Automatic-connected vehicle based on its real-world mechanical data. We also generate an ‘expected’ model, based on EPA drive cycle data, for each supported make, model and year, and compare the two to detect inefficient vehicle operation or driving behavior among our users, e.g. an attached ski rack, under-inflated tires or aggressive acceleration.

Automatic also detects events such as hard braking and acceleration, speeding, etc., along with the location where they occur, while a trip is in progress. We upload these events in real time to a Spark Streaming pipeline for geographical clustering, followed by logic to detect trends that can indicate road hazards and opportunities for traffic planning improvements, e.g. blind intersections, inefficient signal placement or timing, or poor road conditions.

This talk will highlight how we use Spark and Spark Streaming for these applications, along with some novel techniques for analyzing and visualizing real-time automotive data.
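As a rough illustration of the geographical clustering idea mentioned in the abstract, the sketch below bins driving events into a fixed latitude/longitude grid and reports cells where one kind of event recurs. It is plain Python rather than Spark, and it is not Automatic's actual method: grid binning stands in for whatever clustering the pipeline really uses, and the names `Event`, `grid_cell`, `hotspots`, the cell size and the count threshold are all assumptions made for this example.

```python
import math
from collections import Counter
from typing import Dict, Iterable, NamedTuple, Tuple


class Event(NamedTuple):
    """A hypothetical driving event; the field names are assumptions."""
    kind: str   # e.g. "hard_brake", "speeding"
    lat: float  # latitude in degrees
    lon: float  # longitude in degrees


def grid_cell(lat: float, lon: float, cell_deg: float = 0.001) -> Tuple[int, int]:
    """Map a coordinate to the integer index of the grid cell containing it."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def hotspots(
    events: Iterable[Event],
    kind: str,
    min_count: int = 3,
    cell_deg: float = 0.001,
) -> Dict[Tuple[int, int], int]:
    """Return grid cells with at least `min_count` events of the given kind."""
    counts = Counter(
        grid_cell(e.lat, e.lon, cell_deg) for e in events if e.kind == kind
    )
    return {cell: n for cell, n in counts.items() if n >= min_count}


if __name__ == "__main__":
    # Made-up sample events, not real data.
    sample = [
        Event("hard_brake", 37.77491, -122.41941),
        Event("hard_brake", 37.77495, -122.41945),
        Event("hard_brake", 37.77499, -122.41949),
        Event("hard_brake", 37.80440, -122.27120),
        Event("speeding", 37.77493, -122.41943),
    ]
    print(hotspots(sample, "hard_brake"))
```

In a streaming setting, the same cell index could serve as the key for a windowed count, so that a cell is flagged only when events keep recurring there over time. That design is likewise an assumption, not something the abstract specifies.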

Arpan Ghosh is a data engineer at Automatic. He works with data from Automatic-connected cars to generate personalized insights around efficient driving and to build broader data products that power the emerging connected-car ecosystem. Arpan has used Spark for research in parallelized machine learning and for building data-driven products since 2011. He is a contributor to Kiji, a real-time personalization engine. Arpan has an MS in Computer Science from Princeton.

Rob Ferguson is Director of Engineering at Automatic Labs, which makes a smart connected-car device and experiences that make driving smarter and safer. Previously, he founded the Data Science team at Rdio, a streaming music service, and delivered several data products, including automated stations. He’s passionate about data products and insights, with over 10 years of professional development experience as well as research experience at Johns Hopkins and McGill Universities.
