Principal Engineer of Distributed Computing (Paxata)



With Paxata, our customers are securing our borders, tracking down money launderers, and delivering fresh yogurt to your local grocery. Pioneering the idea that you should not have to write MapReduce jobs to run a data-driven organization, Paxata empowers anyone to transform large, raw data sets into useful AnswerSets that fuel insight generation in ways not previously possible.

Innovation moves quickly on our technology platform, which combines semantic algorithms, Apache Spark, an awesome UI, and a cloud architecture. The problems we solve are challenging and fun, like dynamically increasing our data capacity by an order of magnitude and enabling users to intuitively command complex data solutions. Our business opportunity is huge and our impact is real: in just a year since launch, over 150 Pax Power users across financial services, government, and consumer industries solve data problems with Paxata.

We are a tight-knit team of experienced and diverse problem solvers with track records of writing complex software, closing deals, and delivering customer success. We value results and learning and welcome you to join us in building the next great enterprise software company. If you want to build ground-breaking products, transform the lives of customers, and drive a culture of results and confidence, this is the team for you.

Founded in May 2012, Paxata is headquartered in beautiful downtown Redwood City, CA (by the Caltrain station).

What you would be doing:

  • Design and implement distributed computing architectures using both your own code and the latest advances in open-source software
  • Work with the latest technologies including Apache Spark, Parquet, and the Hadoop stack
  • Develop and improve our in-memory database implementation for processing data throughout our workflow for creating master datasets (building the database itself, not just managing or using one)
  • Scale up our ability to handle hundreds of millions of customer records, then billions and beyond



What we are looking for:

  • Demonstrated experience designing and implementing complex data processing infrastructure such as MapReduce internals, database query optimizers, or compilers
  • Familiarity with Apache Spark or distributed databases
  • Industry experience as a senior software engineer or graduate-level research experience in related topics
  • Excellent communication skills, since no one works in a vacuum here
  • Strong interest in building a company to last



Job posted 6/12/2015