Monitoring Engineer
Date:
10 Sept 2025
Company:
Qualitest Group
Country/Region:
IN
We are looking for a skilled and proactive Monitoring and Observability Engineer with strong expertise in Full Stack Observability to join our engineering organization. This role is critical to ensuring real-time visibility into the performance, health, and reliability of our distributed systems and applications. You will design, implement, and manage observability solutions leveraging tools like New Relic, and work closely with developers, DevOps, and SRE teams to drive performance optimization and incident reduction across the stack.

Key Responsibilities

Incident Support and Root Cause Analysis
Collaborate in incident detection, response, and resolution by providing real-time observability insights.
Perform root cause analysis using telemetry data and assist in post-incident reviews.
Improve Mean Time to Detect (MTTD) and Mean Time to Resolve (MTTR) using data-driven approaches.

Performance & Availability Optimization
Monitor application and infrastructure performance proactively.
Identify bottlenecks and trends impacting availability, latency, throughput, and error rates.
Support load testing, scalability assessments, and SLA/SLO adherence through instrumentation and data analysis.

Collaboration and Enablement
Act as a subject matter expert on observability and monitoring best practices.
Train and enable engineering and DevOps teams to self-serve observability insights.
Stay current with trends in observability, AIOps, and telemetry technologies.

Required Qualifications

Education & Experience
Bachelor’s degree in Computer Science, Engineering, or a related field.
4+ years of experience in observability, monitoring, or DevOps roles.
2+ years of hands-on experience with New Relic in production environments.

Technical Skills
Strong knowledge of monitoring and observability tools (New Relic, Datadog, Prometheus, Grafana, ELK, etc.).
Experience instrumenting applications using OpenTelemetry or New Relic agents.
Proficiency with scripting languages (Python, Bash) and CI/CD tools (Jenkins, GitHub Actions, etc.).
Familiarity with microservices, Kubernetes, Docker, and public cloud platforms (AWS, Azure, GCP).
Experience with log aggregation, distributed tracing, and APM at scale.

Soft Skills
Strong problem-solving skills and attention to detail.
Effective communication and collaboration abilities across technical and non-technical teams.
Comfortable working in a fast-paced, dynamic environment.

3 must haves
SRE 4/5
Monitoring 4/5