SESSION

Analyzing IOT Data in Apache Spark Across Data Centers and Cloud with NetApp Data Fabric and NetApp Private Storage

This session explains how NetApp simplifies analyzing IoT data with Apache Spark clusters that span data centers and the cloud, using NetApp Private Storage (NPS) for AWS/Azure, NetApp Data Fabric, and the NetApp Connectors for NFS and S3. IoT data originates at the edge in different geographical locations, and it can arrive at different data centers or the cloud depending on sensor location. The challenge is how to combine these data streams across data centers to generate wider insights.

Learn how NetApp Data Fabric helps solve this challenge. In the Data Fabric architecture, IoT data is ingested via Kafka into an Apache Spark cluster running in AWS/Azure, but the data is stored in an NPS-provisioned NFS share through the NFS Connector. The IoT data in NPS can then be moved to on-prem data centers, or on-prem IoT data can be moved to NPS or ONTAP Cloud for processing in AWS/Azure, using NetApp SnapMirror, FlexClone, or the NFS Connector. We’ll also review how NetApp StorageGRID object storage retains IoT data for archival purposes using S3 Target. Together, these options let you analyze IoT data from AWS, StorageGRID, HDFS, or NFS, providing a feasible way to deploy Spark clusters across data centers.
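
The ingest path described above (Kafka into Spark, with the data landing on an NFS share) might look roughly like the following minimal sketch. It is not NetApp's implementation. It assumes Spark Structured Streaming with the Kafka source package available, and it assumes the NFS share is already mounted at the same absolute path on every Spark node, so the NFS Connector itself is not shown. The names IngestConfig, kafka_source_options, sink_paths, and start_ingest, the directory layout, and all broker, topic, and mount values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IngestConfig:
    """Hypothetical settings; every value passed in is a placeholder."""
    kafka_brokers: str  # comma-separated host:port list of Kafka brokers
    topic: str          # Kafka topic carrying the sensor readings
    nfs_mount: str      # absolute mount point of the NFS share on each node


def kafka_source_options(cfg):
    """Options for Spark's Kafka streaming source."""
    return {
        "kafka.bootstrap.servers": cfg.kafka_brokers,
        "subscribe": cfg.topic,
    }


def sink_paths(cfg):
    """Output and checkpoint locations on the mounted share (assumed layout)."""
    base = cfg.nfs_mount.rstrip("/")
    return {
        "path": f"file://{base}/iot/{cfg.topic}",
        "checkpointLocation": f"file://{base}/checkpoints/{cfg.topic}",
    }


def start_ingest(spark, cfg):
    """Start a streaming query that copies Kafka records to Parquet files.

    `spark` is an existing pyspark.sql.SparkSession created by the caller.
    """
    records = (
        spark.readStream.format("kafka")
        .options(**kafka_source_options(cfg))
        .load()
        # Kafka delivers key and value as binary; cast them to strings.
        .selectExpr(
            "CAST(key AS STRING) AS key",
            "CAST(value AS STRING) AS value",
            "timestamp",
        )
    )
    return (
        records.writeStream.format("parquet")
        .options(**sink_paths(cfg))
        .start()
    )
```

Because the share appears as an ordinary file path in this sketch, the same Parquet files could later be read by a Spark cluster in another location once the data has been replicated there.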

Takeaways include identifying Spark challenges that can be remedied by extending your Spark environment to take advantage of NPS; understanding how NPS and StorageGRID can provide a cost-effective alternative for dev/test and DR for Spark analytics; and understanding Spark architecture and deployment options that use data from multiple locations, including on-prem and cloud-based repositories.

Session hashtag: #SFeco4

Karthikeyan Nagalingam, Technical Marketing Engineer at NetApp

About Karthikeyan

I work as a Big Data Analytics Technical Marketing Engineer at NetApp Inc. I architect Hadoop solutions and proofs of concept, and I present Hadoop solutions to customers, field experts, and partners through events such as NetApp Insight, Forsight, Research Triangle Park local meetups, and the NetApp Executive Briefing Center. I also assist customers in presales and postsales.

Nilesh Bagad, Product Management, Big Data, IoT at NetApp

About Nilesh

Nilesh leads product management for Big Data and IoT solutions at NetApp, including build, buy, and partner options across the stack. He is responsible for joint solution development and go-to-market initiatives with leading Hadoop and NoSQL partners. He holds an MBA from the University of Texas at Austin and an MS in computer engineering from the University of Minnesota, Twin Cities.