A straight-up, no-fluff overview of using Elasticsearch and Spark for real-time indexing, search, and data analysis. This session will illustrate the rich integration between Spark and Elasticsearch, from the Hadoop Input/OutputFormat to the native Java and Scala APIs. We’ll also touch on Elasticsearch’s support for Spark SQL, one of the first integrations for this emerging technology. Throughout the presentation, Elasticsearch will serve both as a sink for Spark data and as a data source that feeds search results back into Spark. Lastly, we will cover how the Elasticsearch engine can help simplify computations based on TF/IDF (term frequency/inverse document frequency).
Costin Leau is an engineer at Elasticsearch, where he leads the Big Data/Hadoop efforts. An open-source veteran, Costin led various Spring projects and authored an OSGi spec. He has spoken at various editions of Strata, Hadoop Summit, JavaOne, Devoxx/Javapolis, JavaZone, and SpringOne on Java, Hadoop, and Elasticsearch topics.