Site Reliability Engineer

Full time at ZEISS India in India
Posted on January 21, 2025

Job details

Job Purpose

As a Site Reliability Engineer for our digital business and e-commerce solutions, you will work within a cross-functional, international team of product owners, UX/UI designers, analytics experts, engineers, and architects. You will play a critical role in ensuring our services are always available, scalable, and engineered to withstand unparalleled demand. You will be involved in incident management, troubleshooting, and root cause analysis, with a strong emphasis on automation and improving our operational processes.

Reporting Manager: Architect – Site Reliability Engineer

This is an Individual Contributor role.

Roles & Responsibilities

Incident Management
• Resolve incidents and drive postmortem reviews to improve service quality.
• Handle resolution of blockers and escalation to stakeholders.

Monitoring and Observability
• Work closely with the Dev and SRE teams to select appropriate metrics related to observability and reliability.
• Measure the reliability of the service using SLIs and SLOs, and consider how to minimize the risk of service degradation.

Documentation
• Maintain the required documentation methods and tools.
• Build playbooks of troubleshooting techniques that SREs can use to identify and investigate issues effectively.

Updates & Migrations
• Support the development team in bringing new software or new features (Digital Offering) to production as quickly as possible, while ensuring an acceptable level of IT operations performance and error risk in line with the agreed service level agreements (SLAs).
• Work with different Product Owners, Site Reliability Engineers, and the Cloud Platform Teams to migrate between different Cloud Platforms while ensuring reliability for business offerings.

Education & Work Experience
• 3-6 years of relevant industry experience.
• Minimum of 2 years' experience as a Site Reliability Engineer.
• Minimum of 2 years' experience with cloud computing platforms like Azure and related services.
• Good knowledge of system architecture, networking, and microservice-based distributed systems.
• Expertise in designing and implementing reliable, scalable, and fault-tolerant systems using container technologies like Docker and Kubernetes.
• Proficiency in setting up and managing monitoring, alerting, and logging systems for early detection and resolution of issues on container orchestrators like Kubernetes, using tools like Prometheus, Grafana, OpenTelemetry Collector, or similar.
• Hands-on experience in incident management, including incident response, troubleshooting, and post-mortem analysis.
• Proficiency in coding/scripting languages and tools commonly used in infrastructure automation and monitoring (such as Terraform).
• Knowledge of best practices in disaster recovery planning and execution for cloud-based systems.
• Capability to advocate for SRE best practices and principles within the organization and drive cultural changes as needed.

Do you know your way around these? Then our team would love to hear from you!

ZEISS in India

ZEISS in India is headquartered in Bengaluru and present in the fields of Industrial Quality Solutions, Research Microscopy Solutions, Medical Technology, Vision Care, and Sports & Cine Optics. ZEISS India has three production facilities, an R&D center, Global IT services, and about 40 Sales & Service offices in almost all Tier I and Tier II cities in India. With 2200+ employees and continued investments over 25 years in India, ZEISS' success story in India is continuing at a rapid pace.

Further information at ZEISS India

Apply safely

For information on common scams and free expert advice to help you stay safe in your job search, we recommend that you visit SAFERjobs, a non-profit, joint industry and law enforcement organization working to combat job scams.
