CoE - AI Data Services - Sr Data Engineer
Date: 3 Feb 2026
Company: Qualitest Group
Country/Region: IN
Job Description:
5 to 8 Years

Role Summary:
Hands-on data engineer responsible for designing, building, and operating reliable, scalable data pipelines and platforms to enable AI-driven targeting, analytics, and agentic workflows. The role includes data acquisition, modeling, transformation, and integration with enterprise platforms (Databricks on AWS, Apache Spark, Veeva CRM). Familiarity with Citrix environments is required for secure network access.

Key Responsibilities:
- Design, build, and optimize scalable data pipelines for batch and streaming workloads using Databricks (AWS) and Apache Spark.
- Ingest, clean, transform, and publish data from diverse sources, ensuring high data quality and reliability.
- Model and harmonize data for curated datasets, semantic layers, and analytics-ready marts.
- Integrate with Veeva CRM and other healthcare/CRM platforms for operational and analytics use cases.
- Implement and monitor data validation, lineage, and governance practices; enforce compliance and privacy controls.
- Collaborate with Product Owners, Data Scientists, and Full Stack Engineers to deliver business value and support AI/agentic workflows.
- Operate and troubleshoot data platforms in Citrix environments; follow enterprise access and incident management protocols.
- Document data processes, pipelines, and systems for reproducibility and knowledge transfer.

Technical Requirements:
- Strong hands-on experience with Databricks (AWS), Apache Spark, and cloud data platforms (AWS/Azure).
- Proficiency in Python, SQL, and data pipeline orchestration tools (Airflow, Azure Data Factory).
- Experience with data modeling, schema design, and optimization for relational and NoSQL databases.
- Familiarity with Veeva CRM data structures, integration patterns, and authentication.
- Knowledge of data governance, lineage, and compliance in healthcare/CRM contexts.
- Experience with CI/CD, source management (GitHub), automated testing, and secure coding practices.
- Experience with monitoring, logging, and alerting frameworks (CloudWatch, Datadog, Prometheus).
- Comfortable working in Citrix environments for secure network access.

Essential Skills:
- Analytical, problem-solving, and investigative skills; ability to adapt to new technologies and data challenges.
- Clear written and verbal communication; ability to collaborate with technical and non-technical stakeholders.
- Experience with incident/problem management and working within enterprise standards.
- Self-motivated; able to prioritize and complete multiple tasks independently; adaptable to changing priorities.
- Healthcare/CRM domain exposure and delivery in cross-functional product squads.

Qualifications:
- Bachelor's or Master's in Computer Science, Data Engineering, or related field (or equivalent experience).
- 5 to 8 years of professional experience in data engineering, pipeline development, and cloud data platforms.

Nice to Have:
- Experience with AI/LLM/agentic data workflows, feature engineering, and streaming analytics (Kafka).
- Healthcare/CRM data experience and delivery in cross-functional product squads.

Additional Comments:
This is a remote role; engineers are expected to connect to Lilly systems from their own systems via Citrix.

3 must haves