o Parallel distributed processing based on the MapReduce paradigm with Apache Spark or Apache Hadoop.
o Experience with or knowledge of any of the main Big Data distributions: Cloudera, Hortonworks, MapR, etc.
o Distributed processing of real-time data with Apache Storm or Spark Streaming.
o Experience with or knowledge of the tools available in the Hadoop ecosystem: Hive, Pig, Flume, ZooKeeper, Sqoop, etc.
o Knowledge of NoSQL databases: Cassandra, MongoDB, HBase, Redis, Aerospike, etc.
o Cluster resource management: Mesos, YARN.
o Experience with recommendation engines and machine learning algorithms: Apache Mahout, Spark MLlib, etc.
o Complex event processing (CEP) engines such as Esper or Siddhi.
o Experience with virtualization and cloud deployment: AWS, Apache jclouds, Docker.
o Experience with search engines based on Lucene such as Elasticsearch.
o Interest in learning and sharing knowledge with other team members.
o Strong capacity for learning and self-management.