SESSION

Distributed Heterogeneous Mixture Learning On Spark

NEC has developed a unique machine learning algorithm for FAB/HMEs (hierarchical mixtures of experts learned with factorized asymptotic Bayesian inference) and is expanding its data analytics business to enterprise customers. FAB/HMEs are highly accurate, interpretable models that combine rule-based space partitioning with a sparse linear model in each partition (also known as piecewise sparse linear models). As the core technology of so-called "Heterogeneous Mixture Learning", FAB/HMEs have already achieved many enterprise-level successes in real-world predictive analysis, e.g. energy and water demand forecasting to make the best use of natural resources, sales forecasting to minimize food disposal in retail stores, and repair parts demand prediction to optimize logistics inventory. In this session, we share the lessons we learned from developing dFAB, the distributed learning algorithm for FAB/HMEs, and from implementing it on Spark. To achieve scale-out performance on Spark, dFAB is carefully designed to reduce the communication cost between the driver process and the executors and to improve multicore CPU utilization on each worker. dFAB has two significant design features. The first is an RDD design that enables dFAB to exploit modern matrix calculation libraries such as BLAS and Breeze. The second is an efficient implementation on Spark that reduces data transfers and CPU idle time. We present experimental results that demonstrate dFAB's scale-out performance and its higher accuracy and interpretability compared with the algorithms currently implemented in Spark MLlib.
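To make the phrase "piecewise sparse linear model" concrete, here is a minimal Python sketch of how such a model could produce a prediction: a rule routes the input to a partition, and that partition's sparse linear model computes the output. This is an illustration only, not NEC's FAB/HME algorithm or dFAB. The class names (`Gate`, `Leaf`), the feature names, and all coefficients are hypothetical, and the hard threshold rule is a simplification; how FAB/HMEs actually gate inputs and learn the tree is not described in this abstract.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass
class Leaf:
    """One partition with its own sparse linear model (hypothetical structure)."""
    weights: Dict[str, float]  # only the non-zero coefficients are stored
    bias: float = 0.0

    def predict(self, x: Dict[str, float]) -> float:
        # Features absent from `weights` have an implicit coefficient of zero.
        return self.bias + sum(w * x[name] for name, w in self.weights.items())


@dataclass
class Gate:
    """A rule node: go left if x[feature] < threshold, otherwise go right."""
    feature: str
    threshold: float
    left: Union["Gate", Leaf]
    right: Union["Gate", Leaf]


def predict(node: Union[Gate, Leaf], x: Dict[str, float]) -> float:
    # Follow the rules down to a partition, then apply its linear model.
    while isinstance(node, Gate):
        node = node.left if x[node.feature] < node.threshold else node.right
    return node.predict(x)


# Toy model with made-up values: one rule and two partitions.
model = Gate(
    feature="temperature",
    threshold=25.0,
    left=Leaf(weights={"hour": 2.0}, bias=10.0),
    right=Leaf(weights={"temperature": 3.0, "is_weekend": -5.0}, bias=20.0),
)

cool = {"temperature": 18.0, "hour": 9.0, "is_weekend": 0.0}
hot = {"temperature": 30.0, "hour": 9.0, "is_weekend": 1.0}

print(predict(model, cool))  # left partition:  10 + 2*9        = 28.0
print(predict(model, hot))   # right partition: 20 + 3*30 - 5*1 = 105.0
```

The interpretability claim follows from this shape: each prediction can be explained by the rule that selected the partition plus the few non-zero coefficients used inside it.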

Masato Asahara, Computer Scientist at NEC

About Masato

Masato Asahara (Ph.D.) received his MS degree in computer science and his Ph.D. from Keio University in 2007 and 2011, respectively. He currently works at NEC Knowledge Discovery Research Laboratories. His research interests include distributed computing platforms for advanced predictive analysis using novel machine learning algorithms. He moved to the Cupertino office in 2015, where he works on R&D and business development for advanced Big Data analytics solutions using NEC's Heterogeneous Mixture Learning technology. http://www.nec.com/en/global/rd/crl/datamining/members/profile_asahara.html

Ryohei Fujimaki, NEC

About Ryohei

Ryohei Fujimaki (Ph.D.) received his MS degree in aerospace engineering from the University of Tokyo in 2006 and his Ph.D. in 2010. He became the youngest research fellow in the history of NEC Labs in recognition of his business and R&D contributions, and he leads advanced analytics teams in the US, Japan, and China that develop leading-edge technologies and business solutions. He has published papers at top conferences such as KDD, ICML, and NIPS, and has developed many predictive analysis solutions with clients. He is a recipient of the Advanced Technology award 2015 in Japan. http://www.nec.com/en/global/rd/crl/datamining/members/profile_fujimaki.html