Spark Summit 2014 brought the Apache Spark community together on June 30 - July 2, 2014 at The Westin St. Francis in San Francisco. It featured production users of Spark, Shark, Spark Streaming and related projects.
To understand the brain, we must record from and manipulate as much of it as possible. New technologies enable recording large fractions of the nervous system in awake, behaving animals. But these techniques generate massive data sets — larger than any yet encountered in neuroscience. We are building a library, with Spark at the core, for finding patterns in complex, high-dimensional neural responses. Our library, called Thunder, is open-source, user-oriented, and applicable to a wide variety of neural data.
In this talk, I will describe our analyses, many of which use PySpark integrated with SciPy and scikit-learn, and show how we use them to uncover both the spatial organization of neural coding and the temporal structure of neuronal dynamics, in multiple animal systems, at the scale of large populations and in some cases the entire brain. We are also using Spark Streaming to perform analyses online during experiments. I will describe how we are integrating Spark Streaming and MLlib to develop a family of streaming machine learning algorithms. This platform provides a live window on the functioning brain and enables a new paradigm for functionally targeted manipulations of neural activity.
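To make the idea of a streaming machine learning algorithm concrete, here is a minimal sketch of one possible member of such a family: a k-means model whose cluster centers are updated one batch at a time as data arrive. It is an illustration only, written in plain Python rather than against Spark Streaming, and it is not Thunder's or MLlib's API. The class name `StreamingKMeansSketch`, the `decay` parameter, and the helper functions are all hypothetical.

```python
from typing import List, Sequence

Point = Sequence[float]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(centers: List[List[float]], point: Point) -> int:
    """Index of the center closest to the given point."""
    return min(range(len(centers)),
               key=lambda i: squared_distance(centers[i], point))


class StreamingKMeansSketch:
    """Hypothetical illustration; not Thunder's or MLlib's API."""

    def __init__(self, initial_centers: Sequence[Point], decay: float = 1.0):
        self.centers = [list(c) for c in initial_centers]
        # Effective number of points each center has absorbed so far.
        self.weights = [0.0] * len(self.centers)
        # decay = 1.0 keeps all history; smaller values forget old batches.
        self.decay = decay

    def update(self, batch: Sequence[Point]) -> List[List[float]]:
        """Fold one batch of points into the current cluster centers."""
        k = len(self.centers)
        dim = len(self.centers[0])
        sums = [[0.0] * dim for _ in range(k)]
        counts = [0] * k

        # Assign each incoming point to its nearest current center.
        for point in batch:
            i = nearest(self.centers, point)
            counts[i] += 1
            for d in range(dim):
                sums[i][d] += point[d]

        # Down-weight history, then move each center toward its batch mean.
        for i in range(k):
            old_weight = self.weights[i] * self.decay
            total = old_weight + counts[i]
            if counts[i] > 0:
                self.centers[i] = [
                    (self.centers[i][d] * old_weight + sums[i][d]) / total
                    for d in range(dim)
                ]
            self.weights[i] = total
        return self.centers


# Two small batches stand in for successive time windows of an experiment.
model = StreamingKMeansSketch([[0.0, 0.0], [10.0, 10.0]])
model.update([[1.0, 1.0], [3.0, 3.0]])
centers = model.update([[4.0, 4.0], [8.0, 8.0]])
```

In a streaming setting, each call to `update` would correspond to one incoming batch of measurements, so the model can be inspected while the experiment is still running. The `decay` parameter is one simple way to let the model track responses that drift over time.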
Jeremy Freeman is a neuroscientist using computation to understand the brain. He obtained his BA from Swarthmore College in math, biology, and psychology. He completed a PhD in neural science at New York University, and is currently a Group Leader at HHMI’s Janelia Farm Research Campus. Freeman develops new approaches for analyzing, visualizing, and understanding large-scale patterns of neural activity. He hopes to reveal the deep principles according to which all brains function, including our own.