Senior Data Engineer

Date: 20 Dec 2025

Company: Qualitest Group

Country/Region: RO

We’re looking for a highly skilled and opinionated Senior Data Engineer to join a Data Platform team.

If you thrive in cloud-native, high-scale environments and enjoy full ownership of the data lifecycle - from design to deployment - this role is for you.

You’ll work closely with data scientists, ML engineers, and product teams to build scalable, production-ready data infrastructure leveraging Apache Spark, AWS services, and Infrastructure-as-Code.

Responsibilities:

Influence Data Architecture: Design scalable, secure data platforms using Spark (EMR), Glue, and AWS event-driven services.
Develop Scalable Pipelines: Build batch & streaming ETL/ELT pipelines with Spark EMR, Athena, Iceberg, EKS, Lambda.
Drive Innovation: Introduce patterns like data mesh, serverless analytics, and schema-aware pipelines.
Cross-Team Collaboration: Work with ML, backend, and product teams to deliver data-powered solutions.
Operational Excellence: Apply observability, cost control, performance tuning, and CI/CD automation using CloudWatch, Step Functions, Terraform/CDK.
Security: Implement AWS best practices for IAM, encryption, compliance, and auditability.

Requirements:

8+ years of experience in software development, at least 3+ years of experience in data engineering, with proven responsibility for designing, developing, and maintaining large-scale, distributed data systems in cloud-native environments (AWS).
End-to-end ownership of complex data architectures – from data ingestion to processing, storage, and delivery in production-grade systems.
Deep understanding of data modeling, data quality, and pipeline performance optimization.

Technical Expertise:

AWS Services: Strong hands-on experience with S3, Lambda, Glue, Step Functions, Kinesis, and Athena – including building event-driven and serverless data pipelines - MUST
Apache Spark: Expertise in writing efficient Spark jobs using PySpark, with experience running workloads on AWS EMR and/or Glue for large-scale ETL and analytical tasks.
Solid experience in building and maintaining both batch and real-time (streaming) data pipelines and integrating them into production systems.
Infrastructure as Code (IaC): Proficient in using Terraform, AWS CDK, or SAM to automate deployment and manage scalable data infrastructure.
Python as a primary development language (with bonus points for TypeScript experience).
Comfortable working in agile, fast-paced environments, with strong debugging, testing, and performance-tuning capabilities.