SESSION

Multi-Label Graph Analysis and Computations Using GraphX


In real-life applications, analysis often has to be conducted on graphs whose nodes and edges carry multiple labels. For example, in a graph that represents user activities in a social network, the labels on nodes may indicate membership in communities (e.g., group, school, company), and the labels on edges may denote types of activity (e.g., comment, like, share). The current GraphX library in Spark does not directly support efficient analysis and computation on label-defined subgraphs.
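To make the idea of a label-defined subgraph concrete, here is a minimal sketch in plain Python. It is not GraphX and not the speakers' library: the `Edge` and `MultiLabelGraph` classes, the `subgraph` method, and the sample users and labels are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    src: str
    dst: str
    labels: frozenset  # activity types on this edge, e.g. {"like", "share"}


@dataclass
class MultiLabelGraph:
    node_labels: dict  # node id -> set of community labels
    edges: list        # list of Edge

    def subgraph(self, node_label=None, edge_label=None):
        """Keep nodes that carry node_label and edges that carry edge_label.

        A filter left as None is not applied. An edge survives only if
        both of its endpoints survive the node filter.
        """
        nodes = {n: ls for n, ls in self.node_labels.items()
                 if node_label is None or node_label in ls}
        edges = [e for e in self.edges
                 if e.src in nodes and e.dst in nodes
                 and (edge_label is None or edge_label in e.labels)]
        return MultiLabelGraph(nodes, edges)


# Hypothetical toy data: three users, two kinds of community label.
graph = MultiLabelGraph(
    node_labels={
        "u1": {"school:S", "company:A"},
        "u2": {"company:A"},
        "u3": {"school:S"},
    },
    edges=[
        Edge("u1", "u2", frozenset({"like", "comment"})),
        Edge("u2", "u1", frozenset({"share"})),
        Edge("u3", "u1", frozenset({"like"})),
    ],
)

# "Like" activity among members of company A only.
likes_in_company = graph.subgraph(node_label="company:A", edge_label="like")
```

In this sketch the subgraph is rebuilt by scanning every node and edge, which is the kind of repeated filtering that becomes costly when many label combinations have to be analyzed on a large graph.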

In this session, the speakers will propose a general API library that supports analysis on multi-label graphs and can be reused and extended to design more complicated algorithms. It includes a method to create multi-label graphs and calculate basic statistics and metrics at both the global and subgraph level. Common graph algorithms, such as PageRank, can also be implemented efficiently in parallel by reusing existing GraphX modules and algorithms, such as the Pregel API.
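The subgraph-level statistics and PageRank described above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the proposed library and not GraphX code: the function names, the toy edge list, and the damping and iteration settings are all hypothetical. The loop only mimics the message-then-update style of a Pregel computation on a single machine.

```python
from collections import defaultdict

# Hypothetical toy data: (source, destination, activity label) triples.
EDGES = [
    ("u1", "u2", "like"),
    ("u3", "u2", "like"),
    ("u2", "u1", "share"),
    ("u3", "u1", "share"),
]


def edges_with_label(edges, label):
    """Edge pairs of the subgraph defined by one activity label."""
    return [(s, d) for s, d, l in edges if l == label]


def in_degrees(pairs):
    """A basic statistic: how many edges point at each node."""
    deg = defaultdict(int)
    for _, d in pairs:
        deg[d] += 1
    return dict(deg)


def pagerank(pairs, damping=0.85, iterations=50):
    """Power-iteration PageRank; ranks sum to 1."""
    nodes = sorted({n for p in pairs for n in p})
    out = defaultdict(list)
    for s, d in pairs:
        out[s].append(d)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Message phase: each node sends rank / out-degree along its edges.
        incoming = defaultdict(float)
        dangling = 0.0  # rank held by nodes with no outgoing edges
        for n in nodes:
            if out[n]:
                share = rank[n] / len(out[n])
                for d in out[n]:
                    incoming[d] += share
            else:
                dangling += rank[n]
        # Update phase: combine received messages with the reset term,
        # spreading dangling rank evenly over all nodes.
        rank = {
            n: (1 - damping) / len(nodes)
            + damping * (incoming[n] + dangling / len(nodes))
            for n in nodes
        }
    return rank


def top_influencer(edges, label):
    """Highest-ranked node in the subgraph for one activity label."""
    ranks = pagerank(edges_with_label(edges, label))
    return max(ranks, key=ranks.get)
```

Calling `top_influencer(EDGES, "like")` and `top_influencer(EDGES, "share")` ranks the same users separately per activity type, which is the per-label view the abstract refers to.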

See how LinkedIn is able to leverage this tool to efficiently find top LinkedIn feed influencers in different communities and by different actions.

Session hashtag: #SFml3

Qiang Zhu, Data Scientist at LinkedIn

About Qiang

Qiang Zhu is a staff member of the Business Analytics Data Mining team at LinkedIn. He and his team apply advanced data mining techniques to drive LinkedIn’s monetization efforts, ranging from a machine learning platform that powers member email marketing to sales intelligence tools that help salespeople sell smarter. Prior to joining LinkedIn, he worked at StumbleUpon as a Data Scientist. Qiang holds a PhD in Computer Science from the University of California, Riverside. His work has appeared in many top-tier data mining conferences and journals, including a paper that won the Best Paper Award at SIGKDD 2012.

Qingbo Hu, Senior Business Analytics Associate at LinkedIn

About Qingbo

Dr. Qingbo Hu received his Ph.D. in 2016 from the University of Illinois at Chicago, where his research advisor was Philip S. Yu. He is currently a senior business analytics associate on LinkedIn’s Analytics team. Dr. Hu has broad research interests in data mining and machine learning theories and techniques, as well as in how to apply them to solve real-life business problems. He has numerous publications in major data mining conferences, such as KDD, ICDM, SDM, WWW, and CIKM.