Big Data Engineer

We invite experienced Big Data Engineers who are ready to create data streaming architectures for various projects, communicate with customers, introduce new approaches, and lead the team. If you are a Big Data expert and want to work with up-to-date tools, we will be glad to see you on our team!

MAIN GOALS AND RESPONSIBILITIES:

  • Collaborate with product owners and team leads to identify, design, and implement new features to support the growing data needs;
  • Design, build, and maintain optimal architecture for real-time streaming and big data analytics to extract, transform, and load data from a wide variety of data sources, including external APIs, data streams, and data lakes;
  • Implement data privacy and data security requirements to ensure solutions stay compliant with security standards and frameworks;
  • Apply MLOps and DataOps practices in real projects;
  • Monitor and anticipate trends in data engineering, and propose changes in alignment with organizational goals and needs;
  • Share knowledge with other teams on various data engineering or project-related topics;
  • Contribute to the core design of the data architecture and the implementation plan, and identify risks;
  • Collaborate with the team to decide on which tools and strategies to use within specific data integration scenarios.

REQUIRED SKILLS:

  • BSc in CS/Mathematics/Statistics/Physics/EE or equivalent experience;
  • 2+ years of proven experience developing large-scale software using an object-oriented or a functional language;
  • Strong programming skills in Python;
  • Proficiency in stream processing using current industry-standard tools (e.g., AWS Kinesis, Kafka Streams, Spark/PySpark, Flink);
  • Solid knowledge of distributed computing approaches, patterns, and technologies (Spark/PySpark and Kafka are a must; Hadoop, Storm, Hive, and Beam are a plus);
  • Experience working with cloud platforms (GCP, AWS) and their data-oriented components (AWS EMR, AWS Athena, AWS Glue, AWS Redshift, Google BigQuery, Google Pub/Sub, GKE, etc.);
  • Proficiency in SQL and query tuning;
  • Understanding of data warehousing principles and modeling concepts (data model types and terminology, including OLTP/OLAP, (de)normalization, dimensional, star, and snowflake modeling, cubes, and graph/NoSQL);
  • Experience with data flow orchestration (Apache Airflow);
  • A team player with excellent collaboration skills;
  • English level: Intermediate or higher.

WOULD BE A PLUS:

  • Experience in data science and machine learning, including building machine learning models;
  • Experience with data integration and business intelligence architecture;
  • Deep knowledge of Spark internals (tuning, query optimization);
  • Expertise in data storage design principles, including an understanding of the pros and cons of SQL/NoSQL solutions, their types, and configurations (standalone/cluster, column/row-oriented, key-value/document stores);
  • Experience with containerized (Docker, ECS, Kubernetes) or serverless (Lambda) deployment;
  • Good knowledge of popular data standards and formats (e.g., JSON, XML, Proto, Parquet, Avro, ORC);
  • Experience with Informatica, Talend, Fivetran, or similar.

WHAT WE OFFER:

  • Large-scale projects;
  • Professional, friendly, and supportive team;
  • Prospects for career development in a team with a 27-year history;
  • Opportunities for professional growth, including participation in thematic events, English courses, certifications, and paid participation in the world's largest industry conferences;
  • Comfortable office space in the city center, regular team-building events, holidays, and legendary corporate parties;
  • Flexible compensation review system;
  • Medical, sports, and accounting support programs.

If you would like to join our team, please send your CV to [email protected] or fill out our online CV form. We look forward to meeting you!