CoE - Digital Engineering - Data Scientist

Date: 17 Mar 2026

Company: Qualitest Group

Country/Region: IN

We are looking for a highly experienced Data Scientist and Machine Learning Engineer to lead the production deployment and optimization of our advanced semantic and keyword search platform. You will serve as a critical part of our Data Science and DevOps teams, ensuring our state-of-the-art models are designed, built, and deployed efficiently, reliably, and at scale within our production environment. This role requires a deep, practical understanding of large-scale NLP models, efficient pipeline design, hands-on coding in SageMaker tooling, and AWS cloud operations.

Key Responsibilities

Data Science: Design and code our embeddings and semantic search pipelines. This requires a deep understanding of BM25 + ANN search, cross-encoder reranking, and pagination; experience with embeddings, vector systems, and both traditional and AI search components; and working knowledge of SageMaker Studio, notebooks, and model selection. Hands-on coding in Python, PyTorch (ANN experience is a plus), and AWS Lambda deployments. Understanding of CPU- and GPU-targeted deployments using AWS Lambda, Boto3, Gunicorn, and SageMaker endpoints.

Technical Leadership: Serve as the subject matter expert on semantic search and embeddings, explaining the components, proper model usage, and the impact of attribute settings on performance, pagination, and GPU scaling in real-world production deployments.

Model Optimization & Scaling: Drive performance and memory/GPU utilization improvements in production by defining, developing, and implementing cross-encoder / bi-encoder optimization strategies for rerankers.

NLP and Transformer Expertise: Apply deep familiarity with SBERT and DistilBERT fine-tuning strategies to enhance search relevance and efficiency.

Pipeline Design: Architect and implement multi-step pipelines for video and image processing and embeddings, using tools such as TransNetV2, transcription, translation, and Hugging Face components.
Performance Engineering: Understand and implement memory utilization and optimization approaches, including chunking and batching techniques for large-scale inference.

Production Deployment: Lead the deployment of the semantic search platform, collaborating closely with the DevOps team to determine the operational approach across multiple AWS EC2 or serverless platforms.

System Ownership: Define and establish the model monitoring, application performance metrics, and automated retraining processes for the production system.

Required Qualifications & Expertise

5+ years of experience as an ML Engineer focused on deploying and optimizing large-scale NLP or computer vision models in production.

Mandatory: Proven experience with cross-encoder / bi-encoder optimization strategies for search rerankers.

Expertise in SBERT and DistilBERT fine-tuning and deployment strategies.

Deep practical knowledge of memory utilization and optimization techniques (e.g., chunking, batching) critical for high-throughput inference.

Experience designing and implementing multi-step data processing pipelines using tools and libraries such as Hugging Face, TransNetV2, and various transcription and translation services.

Hands-on experience with GPU scaling, pagination, and performance attribute tuning in a production setting.

Proficiency in cloud deployment technologies, preferably on AWS (EC2, serverless, SageMaker, etc.).