CoE - Digital Engineering - MLOps Engineer

Date:  6 Apr 2026
Company:  Qualitest Group
Country/Region:  IN

Key Responsibilities

1. SageMaker Pipelines & Model Monitoring

  • Build and maintain SageMaker pipelines for embedding generation and NER workflows.
  • Extend pipelines for new workloads such as query reranking.
  • Implement end-to-end pipelines across Dev, QA, and Prod.
  • Develop custom monitoring (drift detection, latency, failure alerts) using SageMaker + AWS Lambda.
  • Enable A/B testing and production rollout of new models.
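In practice, the "drift detection" bullet above often reduces to a statistical comparison of live feature distributions against a training-time baseline, run on a schedule from AWS Lambda. A minimal sketch of such a check using the population stability index (PSI) — the function names, threshold, and alert payload here are illustrative, not part of the role description:

```python
import math
from collections import Counter


def psi(baseline, live, bins=10):
    """Population stability index between two numeric samples.

    Scores above ~0.2 are commonly treated as significant drift.
    """
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / bins or 1.0

    def bucket(xs):
        counts = Counter(min(int((x - lo) / width), bins - 1) for x in xs)
        # Laplace-smooth empty buckets to avoid log(0) below.
        return [(counts.get(i, 0) + 1) / (len(xs) + bins) for i in range(bins)]

    p, q = bucket(baseline), bucket(live)
    return sum((pi - qi) * math.log(pi / qi) for pi, qi in zip(p, q))


def check_drift(baseline, live, threshold=0.2):
    """Return the kind of payload a Lambda could forward to SNS/CloudWatch."""
    score = psi(baseline, live)
    return {"psi": round(score, 4), "drifted": score > threshold}
```

The same shape extends to latency and failure-rate alerts: compute a metric, compare against a threshold, and emit a structured event for downstream alerting.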

2. GPU/CPU Performance & Cost Optimization

  • Deploy and optimize GPU instances for high-throughput inference workloads.
  • Benchmark and select optimal instance types for:
    • Reranking models
    • Embedding pipelines
    • Vision inference systems
  • Implement Spot Instance strategies for large-scale workloads.
  • Optimize:
    • Batch sizes
    • Memory allocation
    • Concurrency
  • Manage inference services using Gunicorn, Boto3, and SageMaker Endpoints.
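Benchmarking instance types for the workloads above usually comes down to one comparable number: cost per unit of work at sustained throughput. A sketch of that calculation — the candidate instances, prices, and throughput figures below are hypothetical stand-ins for real load-test results:

```python
def cost_per_million(price_per_hour, throughput_per_sec):
    """USD cost to serve 1M inferences at a sustained throughput."""
    seconds = 1_000_000 / throughput_per_sec
    return price_per_hour * seconds / 3600


# Hypothetical benchmark results: {instance: (USD/hour, inferences/sec)}.
# Real numbers come from load tests against the actual model.
candidates = {
    "g5.xlarge (on-demand)": (1.01, 480.0),
    "g5.xlarge (spot)": (0.40, 480.0),
    "c6i.4xlarge (cpu)": (0.68, 95.0),
}

ranked = sorted(candidates.items(), key=lambda kv: cost_per_million(*kv[1]))
cheapest = ranked[0][0]
```

Batch size, memory allocation, and concurrency all move the throughput term, so the same comparison doubles as a tuning loop: re-benchmark after each change and re-rank.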

3. Search Infrastructure (Elasticsearch/Lucene)

  • Optimize hybrid search (BM25 + vector search).
  • Tune:
    • Index configurations
    • Sharding strategies
    • Query performance
  • Collaborate on ANN/HNSW tuning (ef_construction, M).
  • Balance recall, latency, and memory usage at scale.
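The tuning knobs named above map directly onto Elasticsearch 8.x request bodies: `m` and `ef_construction` live in the `dense_vector` mapping, while a hybrid query blends a BM25 clause with a kNN clause. A sketch under assumed field names and dimensions (`title`, `embedding`, 768 dims — all illustrative):

```python
# Index mapping combining a BM25-analyzed text field with an HNSW
# vector field (Elasticsearch 8.x syntax; field names are illustrative).
mapping = {
    "mappings": {
        "properties": {
            "title": {"type": "text"},  # scored with BM25
            "embedding": {
                "type": "dense_vector",
                "dims": 768,
                "index": True,
                "similarity": "cosine",
                # Higher m / ef_construction raise recall, but also
                # memory footprint and index-build time.
                "index_options": {
                    "type": "hnsw",
                    "m": 16,
                    "ef_construction": 128,
                },
            },
        }
    }
}


def hybrid_search_body(text, vector, k=50):
    """Hybrid request: lexical + kNN clauses, blended via boosts."""
    return {
        "query": {"match": {"title": {"query": text, "boost": 0.3}}},
        "knn": {
            "field": "embedding",
            "query_vector": vector,
            "k": k,
            # Search-time recall/latency knob (analogous to ef_search).
            "num_candidates": 4 * k,
            "boost": 0.7,
        },
    }
```

Balancing recall, latency, and memory then becomes sweeping `m`, `ef_construction`, and `num_candidates` against a labeled query set.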

4. Video & Image Pipeline Infrastructure

  • Deploy and scale pipelines for:
    • Video shot detection (TransNet V2)
    • Image embedding generation
  • Containerize and autoscale workloads.
  • Handle large-scale I/O efficiently.
  • Monitor latency, failures, and drift across media pipelines.

5. ML Deployment & Platform Engineering

  • Define standards for ML deployment and CI/CD pipelines.
  • Build deployment workflows across Dev, QA, and Prod.
  • Implement automated health checks and alerting using AWS Lambda.
  • Ensure consistent, scalable deployment practices.
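The "automated health checks and alerting using AWS Lambda" bullet typically means a scheduled handler that pings each service and reports failures. A minimal sketch — the event shape and response format are illustrative; in practice the result would be pushed to CloudWatch or SNS rather than just returned:

```python
import json
import urllib.request


def lambda_handler(event, context):
    """Ping each endpoint listed in the event; report which ones failed."""
    failures = []
    for name, url in event.get("endpoints", {}).items():
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                if resp.status >= 400:
                    failures.append(name)
        except Exception:
            # Connection errors, timeouts, and DNS failures all count
            # as an unhealthy endpoint.
            failures.append(name)
    healthy = not failures
    return {
        "statusCode": 200 if healthy else 503,
        "body": json.dumps({"status": "healthy" if healthy else "degraded",
                            "failed": failures}),
    }
```

Wired to an EventBridge schedule, the same handler gives each environment (Dev, QA, Prod) a uniform liveness signal.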

6. Production Deployment & Cloud Operations

  • Lead deployment across:
    • AWS EC2
    • AWS Lambda
    • SageMaker Endpoints
  • Select optimal deployment strategies based on:
    • Latency
    • Throughput
    • Cost
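The latency/throughput/cost trade-off above can be captured as an explicit routing rule. A deliberately crude sketch — the thresholds are illustrative rules of thumb, not prescriptions:

```python
def pick_deployment(latency_budget_ms, req_per_sec, duty_cycle):
    """Rule-of-thumb router between AWS deployment targets.

    Spiky, infrequent, latency-tolerant work suits Lambda; steady
    high-throughput model serving suits a SageMaker endpoint;
    everything else lands on self-managed EC2.
    """
    if duty_cycle < 0.1 and latency_budget_ms > 200:
        return "aws-lambda"          # pay per invocation; cold starts OK
    if req_per_sec > 50 and latency_budget_ms <= 200:
        return "sagemaker-endpoint"  # managed autoscaling and monitoring
    return "ec2"                     # full control over runtime and cost
```

Encoding the decision this way keeps it reviewable and testable, even if real selections also weigh GPU availability, model size, and team operational capacity.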

Required Qualifications

  • 5+ years as an ML Engineer or MLOps Engineer in production environments
  • Strong experience deploying PyTorch and TensorFlow models
  • Hands-on expertise with:
    • AWS SageMaker (pipelines, endpoints, monitoring)
    • AWS Lambda
    • EC2 and cloud infrastructure
  • Proven experience with:
    • GPU/CPU optimization and benchmarking
    • Memory management and batch tuning
    • Spot Instance cost optimization
  • Experience with Elasticsearch/Lucene for large-scale search systems
  • Expertise in containerized deployments and autoscaling
  • Strong understanding of feature engineering for BERT-based models
  • Experience with model monitoring, evaluation, and A/B testing

Preferred Skills

  • CNNs, diffusion models, and deep learning architectures
  • Ranking systems (cross-encoder / bi-encoder)
  • Approximate nearest neighbor search (HNSW)
  • Clustering (K-Means, DBSCAN)
  • Regression, decision trees, Bayesian methods
  • Experience with multimodal ML systems
