Site Reliability Engineer – India

GoDaddy · India, Remote, India

Remote · Full-time · Senior · Permanent · IT & DevOps

Job Description

At GoDaddy, we are looking for an outstanding Site Reliability Engineer to join our ambitious team in India. This role offers the chance to create, build, and maintain the infrastructure that powers the dreams of millions of entrepreneurs worldwide. You will drive reliability, observability, and cost efficiency across large-scale systems by designing for resilience, automating operations, and proactively preventing incidents so that systems keep performing reliably.

Responsibilities include:

Implementing end-to-end observability using Prometheus, Grafana, CloudWatch, and ServiceNow
Defining and maintaining SLIs/SLOs/SLAs across infrastructure and applications
Architecting and automating AWS infrastructure using CDK, CloudFormation, Python, Go, or Bash, with deployments via GitHub Actions or Jenkins
Managing and troubleshooting containerized workloads across Docker, Kubernetes (EKS), ECS, and Fargate
Ensuring configuration consistency through Ansible, Puppet, or Chef
Designing, building, deploying, and maintaining large-scale, production-grade systems in AWS with end-to-end system ownership
Driving platform reliability by proactively identifying risks, planning for scale and performance, and collaborating with engineering teams to embed reliability and cost awareness
Leading incident management with blameless postmortems and standardized SOPs, using tools such as BigPanda, Site24x7, and ServiceNow
Enhancing infrastructure and CI/CD pipelines to improve performance and cost-effectiveness
Taking ownership of capacity planning, forecasting, and governance

The candidate should have 5+ years of proven SRE experience supporting production-scale systems, along with a strong understanding of SLIs/SLOs and distributed systems reliability and the ability to troubleshoot complex issues. Proficiency with AWS services such as EKS, ECS, Fargate, EC2, S3, RDS, SQS, SNS, CloudFormation, CDK, IAM, and CloudWatch is required. Experience with the incident management tools BigPanda and Site24x7, ServiceNow integration, the configuration management tools Ansible, Puppet, and Chef, and automation in Python, Go, or Bash is essential. Candidates are also expected to be skilled in CI/CD pipelines with GitHub Actions and Jenkins, in container orchestration, and in the monitoring tools Prometheus, Grafana, and CloudWatch. A bachelor’s degree or equivalent experience in computer science, engineering, or a related technical field is preferred.

GoDaddy offers a comprehensive benefits package that includes paid time off, retirement savings plans, bonus/incentive eligibility, equity grants, an employee stock purchase plan, competitive health benefits, and family-friendly benefits such as parental leave. The company embraces a diverse culture and offers Employee Resource Groups. GoDaddy prioritizes diversity, equity, inclusion, and belonging in the workplace and is an equal opportunity employer that will consider qualified applicants with criminal histories in accordance with local regulations. Our recruiting team is available for application assistance via [email protected]. GoDaddy does not accept unsolicited resumes from recruiters or agencies.

Benefits

Paid time off
Retirement savings plans (401k, pension schemes)
Bonus/incentive eligibility
Equity grants and employee stock purchase plan
Competitive health benefits
Family-friendly benefits including parental leave
Employee Resource Groups supporting diverse culture

Requirements

5+ years SRE experience supporting production-scale systems
Strong understanding of SLIs/SLOs and distributed systems reliability
Proficiency with AWS services: EKS, ECS, Fargate, EC2, S3, RDS, SQS, SNS, CloudFormation, CDK, IAM, CloudWatch
Experience with the incident management tools BigPanda and Site24x7, and with ServiceNow integration
Configuration management knowledge with Ansible, Puppet, or Chef
Strong automation skills in Python, Go, or Bash
Expertise in CI/CD pipelines using GitHub Actions and Jenkins
Skilled in the monitoring and observability tools Prometheus, Grafana, and CloudWatch
Bachelor’s degree or equivalent experience in computer science or engineering preferred