San Francisco
June 30 - July 2, 2014


Spark Summit 2015e
HeteroSpark: A Heterogeneous CPU/GPU Spark Platform for Deep Learning Algorithms
Peilong Li (U of Massachusetts Lowell), Yan Luo (U of Massachusetts Lowell)

Deep learning networks such as deep autoencoders, convolutional deep neural networks, and deep belief networks are widely applied in current state-of-the-art technologies. As recent advances in GPU technology deliver superior performance over CPUs, GPUs are increasingly used in a broad range of applications, such as machine learning, image processing, and bioinformatics. However, single-GPU systems and GPU-only clusters struggle with large-scale datasets and demand non-trivial development effort. Spark, as a CPU-only computing platform, provides fast in-memory computing, which is ideal for iterative computation such as machine learning applications. Spark slave nodes, however, can achieve better performance and energy efficiency with GPU acceleration. In this context, we propose HeteroSpark, a GPU-accelerated heterogeneous architecture integrated with Spark, which combines the power of GPUs with the scalability of Spark. We make the following contributions to the Spark community: (1) we integrate the GPU accelerator into the current Spark platform to achieve further data parallelism and algorithm acceleration; (2) we provide a plugin for the Spark platform so that current Spark applications can choose to enable or disable GPU acceleration; (3) accelerators are transparent to developers, so existing Spark applications can be easily ported to the heterogeneous platform without code modifications.
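As a rough illustration of contributions (2) and (3), the sketch below shows one way a per-partition computation could switch between a CPU kernel and an optional accelerator kernel without the calling code changing. It is not HeteroSpark's implementation or API: the names `cpu_dot`, `load_gpu_dot`, `select_kernel`, and `process_partition` are hypothetical, the "GPU" loader is a stub that reports no accelerator, and plain Python lists stand in for a Spark partition.

```python
from typing import Callable, List, Optional

Vector = List[float]
Kernel = Callable[[Vector, Vector], float]


def cpu_dot(a: Vector, b: Vector) -> float:
    """Reference CPU kernel: dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def load_gpu_dot() -> Optional[Kernel]:
    """Hypothetical accelerator hook (an assumption, not a real API).

    A real platform would return a GPU-backed kernel here; this stub
    returns None to signal that no accelerator is available.
    """
    return None


def select_kernel(use_gpu: bool,
                  gpu_loader: Callable[[], Optional[Kernel]] = load_gpu_dot) -> Kernel:
    """Pick the accelerator kernel when enabled and available, else the CPU kernel."""
    if use_gpu:
        gpu_kernel = gpu_loader()
        if gpu_kernel is not None:
            return gpu_kernel
    return cpu_dot


def process_partition(rows: List[Vector],
                      weights: Vector,
                      use_gpu: bool = False,
                      gpu_loader: Callable[[], Optional[Kernel]] = load_gpu_dot) -> List[float]:
    """Apply the selected kernel to every row of one partition.

    The caller's code is identical whether or not acceleration is enabled;
    only the `use_gpu` flag changes.
    """
    kernel = select_kernel(use_gpu, gpu_loader)
    return [kernel(row, weights) for row in rows]


if __name__ == "__main__":
    partition = [[1.0, 2.0], [3.0, 4.0]]
    weights = [0.5, 0.5]
    print(process_partition(partition, weights, use_gpu=False))
    # With the stub loader, enabling the flag falls back to the CPU kernel.
    print(process_partition(partition, weights, use_gpu=True))
```

In a Spark job, a function like `process_partition` could plausibly be passed to a per-partition transformation, with the flag read from job configuration; that wiring is left out here because the abstract does not describe it.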

Peilong Li received the BS degree in electrical engineering from Qingdao University of Science and Technology, China, in 2007. He is currently a Ph.D. candidate in Electrical and Computer Engineering at the University of Massachusetts Lowell, where he is a research assistant in the Laboratory of Advanced Computing and Networking Systems. His research interests include power-efficient cloud and mobile computer architecture, and big data analytics.

Dr. Yan Luo is an Associate Professor in the Department of Electrical and Computer Engineering at the University of Massachusetts Lowell. He obtained his Ph.D. in Computer Science from the University of California, Riverside in 2005 and joined the faculty of UMass Lowell in the same year. While his research interests span computer architecture and network systems, Prof. Luo’s current projects focus on programmable network processing, heterogeneous architecture and systems, and smartphone-based computing. He and his team aim to design and build novel microprocessors and systems to facilitate intelligent networking, deeply embedded sensing, and medical applications.

Dr. Yu Cao is an Assistant Professor in the Department of Computer Science at the University of Massachusetts Lowell. His research interests span a variety of aspects of knowledge discovery from complex data, including biomedical informatics and intelligent systems. He focuses particularly on scientific applications, including intelligent, multi-modal, and data-intensive medical image analysis and retrieval; motion tracking, analysis, and visualization; and intelligent data analysis for electronic medical records and pervasive healthcare monitoring. His research has appeared in various prestigious journals, book chapters, and refereed conference proceedings, and has been supported by both NSF and NIH. He was a guest editor of a special issue of Springer Multimedia Tools and Applications (MTAP) and of a special issue of the Journal of Multimedia (JMM). He has served on the Organizing Committees or Program Committees of more than 10 international conferences and workshops. He is a member of ACM, IEEE, and Upsilon Pi Epsilon (UPE).

Ning Zhang is a first-year Ph.D. student in the Department of Computer Science at the University of Massachusetts Lowell. His research interests are machine learning and distributed systems.

Slides PDF | Video