One of the largest financial institutions wanted to explore open source technologies for its asset regulatory reporting. The source transactional system held around 500 million asset transactions and was growing at a rate of 30,000 trades per hour. For regulatory reporting, each trade had to be run through a series of checks starting from the time the transaction was committed in the source system. The trades were pulled into a clustered HDFS setup, and Spark SQL was used to run real-time checks on the trade transactions and flag those that potentially had to be sent to the regulators. Processing that was traditionally done in batch could thus be converted to real-time processing using Spark. The use case demonstrates newer horizons of Spark usage, such as real-time transaction reporting in enterprise data warehouse setups. The talk will highlight the usage of Spark in the context of real-time reporting.
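As a rough illustration of what a SQL-based flagging check could look like, here is a minimal sketch. The table layout, the column names, the threshold, and the rule itself are all assumptions made for illustration; the abstract does not describe the actual checks. Python's built-in sqlite3 module is used as a stand-in for Spark SQL so that the example runs without a cluster.

```python
import sqlite3

# Hypothetical threshold: the abstract does not state the real reporting rules.
REPORTING_THRESHOLD = 1_000_000

# Hypothetical check: flag a trade if its notional is at or above the
# threshold, or if the counterparty is missing.
FLAG_QUERY = """
    SELECT trade_id,
           notional,
           CASE WHEN notional >= :threshold OR counterparty IS NULL
                THEN 1 ELSE 0 END AS flagged
    FROM trades
    ORDER BY trade_id
"""


def flag_trades(rows, threshold=REPORTING_THRESHOLD):
    """Load trades into an in-memory table and return (trade_id, notional, flagged)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE trades (trade_id INTEGER, counterparty TEXT, notional REAL)"
    )
    conn.executemany("INSERT INTO trades VALUES (?, ?, ?)", rows)
    result = conn.execute(FLAG_QUERY, {"threshold": threshold}).fetchall()
    conn.close()
    return result


# Made-up sample trades: (trade_id, counterparty, notional)
sample = [
    (1, "ACME", 250000.0),
    (2, None, 10000.0),
    (3, "GLOBEX", 5000000.0),
]

flagged_ids = [trade_id for trade_id, _, flagged in flag_trades(sample) if flagged]
print(flagged_ids)
```

In a Spark setup, a similar SELECT (with the threshold inlined) could presumably be issued through `spark.sql(...)` after registering the trade data as a temporary view named `trades`.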
Sudipto Shankar Dasgupta is a Principal Technology Architect and AVP with Infosys Ltd, working on Big Data and Analytics platform development for large enterprises. Prior to that, he was Chief Architect with SAP, working on HANA, in-memory optimizations for SAP installed bases.
Mayoor Rao is a Senior Architect with Infosys Ltd, working on Big Data and Analytics platform development for large enterprises. Mayoor has been pivotal in building large-scale big data platforms and holds rich experience and industry expertise in open-source technologies such as Hadoop, MongoDB and Apache Spark.