Spark Summit 2014 brought the Apache Spark community together from June 30 to July 2, 2014, at The Westin St. Francis in San Francisco. It featured production users of Spark, Shark, Spark Streaming and related projects.
What are the challenges in testing the unique processing features of Spark and Spark Streaming applications? As with any growing technology, Spark keeps adding and upgrading features, so it is crucial for your team to ensure the quality and reliability of your Spark-based applications. At Ooyala, we are building batch and streaming pipelines on Spark. This requires test strategies that give us the confidence to deploy these services to production.
We have been automating unit- and integration-level tests for Spark-based batch and streaming applications. As part of this effort, we simulated cluster-like conditions and built utilities that feed data in real time to streaming applications. Today, we would like to share some of the challenges, test setup requirements, test strategies, potential solutions and best practices that we learned in the process of testing our Spark applications.
Anupama Shetty is a Software Development Engineer in Test for Ooyala’s Analytics team. She has worked on big data processing platforms such as Hadoop, Kafka, Storm and Spark. She has built automation test frameworks for video players, API data verification and Spark applications such as Job Server and Streaming. She holds a Master's degree in Software Engineering from San Jose State University.
Neil Marshall is a Software Development Engineer in Test with the Video Analytics team at Ooyala. At Ooyala, he created test frameworks for video analytics queries against Job Server. His experience in big data technologies includes Spark, Hadoop, Cassandra, Redis, Akka, Storm and Job Server. Before Ooyala, he worked as a programmer for 20+ years, owned a software consulting business and developed test automation and frameworks for TripIt, Salesforce and Microsoft. He holds a Bachelor's degree in Computer Science and Mathematics from Gallaudet University.