Software Engineer – Cloud Platform (Databricks)

At Databricks, we make Big Data simple. The state of the art in Big Data is “simple things complex, complex things impossible.” We think the future should be “simple things easy, and complex things possible.” Join us and work with the world’s leading experts in distributed systems, databases, and networking to help build a next-generation Big Data platform that users love.

Databricks’ vision is to make Big Data simple by empowering anyone to easily build and deploy advanced analytics solutions. We let users focus on extracting value from their data while we handle the challenging, data-agnostic parts of analyzing Big Data: provisioning clusters, scaling them automatically, running workloads reliably in the face of failures, optimizing cluster performance, improving visibility into Spark’s execution, and making it possible to visualize results immediately. The Databricks platform also provides just-in-time data integration, real-time experimentation, and robust deployment of production applications.

To achieve this, Databricks offers a cloud-hosted SaaS platform built around Spark. Building and maintaining such a platform poses many challenges. Reliably managing a big data platform is inherently complex, because the platform must be resilient to many types of failures (e.g., instance failures, disk limits, memory limits, and network failures). The platform must also operate at scale, dynamically handling hundreds of customers and thousands of nodes. It must be secure, since Databricks customers process extremely sensitive data. Finally, it must evolve rapidly and be updated continuously in order to keep improving our customers’ experience.

This is where you come in. You should be a software engineer passionate about architecting, developing, deploying, and operating distributed software systems in the cloud. Your role will be to build services and features in the Databricks cloud platform while handling the challenges described above. You will operate in and help extend a service-oriented architecture, and your software will interact with cloud provider APIs (such as those of AWS and Azure) as well as with internal service APIs and operating system concepts (such as containers). You will work broadly on the Databricks cloud offering, which includes cluster management software (both proprietary and open source), cloud automation of networking setup, performance optimizations in storage access, and more. You are expected to keep raising our standards for reliability, scalability, testing, deployment, and security.

Databricks’ vision is to empower anyone to easily build and deploy advanced analytics solutions. The company was founded by the team that created Apache® Spark™, a powerful open source data processing engine built for sophisticated analytics, ease of use, and speed. Databricks is the largest contributor to the open source Apache Spark project, providing 10x more code than any other company. The company has also trained over 40,000 users on Apache Spark and has the largest number of customers deploying Spark to date. Databricks provides a virtual analytics platform to simplify data integration, real-time experimentation, and robust deployment of production applications. Databricks is venture-backed by Andreessen Horowitz and NEA. For more information, contact


Responsibilities

  • Create new features/services in the Databricks product and contribute to existing features/services. This involves, among other things, writing software in Scala and interacting with cloud APIs and internal APIs.
  • Be responsible for the full software development lifecycle: design, development, testing, and operation in production
  • Architect solutions to achieve a high level of reliability, scalability and security
  • Communicate effectively with other engineers in the same team, with other teams and with various other stakeholders such as product managers
  • Operate in an Agile development environment


Requirements

  • Experience in architecting, developing, deploying, and operating large-scale systems in the cloud (e.g., a public cloud such as AWS or Azure, or an advanced private cloud such as those at Google or Facebook)
  • Experience in designing cloud solutions that are reliable, evolvable, scalable and easily testable
  • Demonstrated ability to operate within a service-oriented architecture (SOA) and experience with API development
  • Experience using cloud APIs such as compute, storage, networking or cluster management
  • Demonstrated ability to work on large projects, a collaborative mindset and good communication skills
  • Working knowledge of Linux OS
  • BS/MS degree in Computer Science, Engineering, or a related subject

Desired Skills

  • Good knowledge of Scala or Java
  • Experience building secure systems that handle sensitive data
  • Experience with container technologies, such as Docker
  • Experience working with Spark
  • Good knowledge of SQL (MySQL preferred)
  • Full stack experience
  • Knowledge of Python
  • Experience in Agile / Scrum development environments


Benefits

  • Medical, dental, vision
  • 401k Retirement Plan
  • Unlimited Paid Time Off
  • Catered lunch (every day), snacks, and drinks
  • Gym reimbursement
  • Employee referral bonus program
  • Awesome coworkers
  • Maternity and paternity plans



Job posted 4/17/2017