Time-Evolving Graph Processing On Commodity Clusters


Real-world graphs are seldom static. Applications that generate graph-structured data today do so continuously, giving rise to an underlying graph whose structure evolves over time. Mining these time-evolving graphs can be insightful from both research and business perspectives. However, there is a lack of general-purpose distributed engines for processing time-evolving graphs. In this work, we present Tegra, a time-evolving graph processing system built on a dataflow framework. Tegra enables three broad classes of operations on evolving graphs. First, it efficiently stores, retrieves, and bulk-transforms multiple graph snapshots using an index based on persistent data structures. Second, it supports temporal graph analysis tasks such as evolutionary queries through a novel timelapse abstraction that lets it process multiple snapshots simultaneously with low overhead. Finally, it provides a lightweight dynamic computation model for sliding-window analytics on streaming graphs. We present an implementation of Tegra on Spark.
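To make the idea of snapshot storage with shared structure more concrete, here is a minimal single-machine sketch in Python. It is not Tegra's implementation or API: the class `SnapshotStore` and its methods are hypothetical names chosen for illustration, and a shallow dictionary copy stands in for a true persistent (tree-structured) index, which would avoid copying the whole vertex map on each commit. What the sketch does show is the core property: adjacency sets of vertices that did not change are shared between consecutive snapshots rather than duplicated, and a query can be evaluated across all snapshots at once.

```python
class SnapshotStore:
    """Toy store of graph snapshots (hypothetical, for illustration only).

    Each snapshot maps a vertex to an immutable frozenset of out-neighbors.
    Unchanged frozensets are shared between snapshots, not copied.
    """

    def __init__(self):
        # Snapshot 0 is the empty graph.
        self._snapshots = [{}]

    def commit(self, added_edges=(), removed_edges=()):
        """Create a new snapshot from the latest one and return its id."""
        prev = self._snapshots[-1]
        # Shallow copy: the vertex map is new, the adjacency sets are shared.
        # Assumption: a real persistent index would use a tree here instead.
        new = dict(prev)
        for src, dst in added_edges:
            new[src] = new.get(src, frozenset()) | {dst}
        for src, dst in removed_edges:
            new[src] = new.get(src, frozenset()) - {dst}
        self._snapshots.append(new)
        return len(self._snapshots) - 1

    def neighbors(self, snapshot_id, vertex):
        """Out-neighbors of a vertex as of the given snapshot."""
        return self._snapshots[snapshot_id].get(vertex, frozenset())

    def out_degree_over_time(self, vertex):
        """A simple evolutionary query: out-degree in every snapshot."""
        return [len(s.get(vertex, frozenset())) for s in self._snapshots]


store = SnapshotStore()
s1 = store.commit(added_edges=[("a", "b"), ("a", "c")])
s2 = store.commit(added_edges=[("b", "c")])
s3 = store.commit(removed_edges=[("a", "b")])

# Vertex "a" was untouched between s1 and s2, so both snapshots
# reference the very same adjacency object.
shared = store.neighbors(s1, "a") is store.neighbors(s2, "a")
degrees = store.out_degree_over_time("a")
```

Older snapshots stay readable after later commits because nothing is mutated in place; `store.neighbors(s1, "a")` still reflects the graph as it was at `s1`. A distributed system would additionally partition the graph across machines, which this sketch does not attempt.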

Ankur Dave, Graduate Student, AMPLab, UC Berkeley

About Ankur

Ankur is a third-year PhD student advised by Ion Stoica in the UC Berkeley AMPLab. He’s a Spark committer and a maintainer for GraphX.