Site Reliability Engineer – Storage Engineer

GoDaddy · United States, Austin, Texas

Hybrid Full-time Entry Engineering

Job Description

Location Details:

At GoDaddy, the future of work looks different for each team. Some teams work in the office full-time, others have a hybrid arrangement (working remotely some days and in the office on others), and some work entirely remotely.

This is a remote position, so you’ll be working remotely from your home. You may occasionally visit a GoDaddy office to meet with your team for events or meetings.

Join Our Team

GoDaddy is seeking a highly skilled and motivated Site Reliability Engineer (SRE) to join our dynamic team. This role centers on automating and maintaining our storage infrastructure, with an emphasis on Ceph, to ensure the reliability, scalability, and performance of our systems.

Compensation:

The estimated pay ranges for this role are listed below. In addition to base pay, this role may be eligible for other forms of compensation, which may include a corporate bonus and/or equity awards, subject to the terms of applicable plans and individual eligibility.

Bay Area (Santa Clara, San Francisco) and Los Angeles: $128,000-$192,000 USD
Austin, D.C. Metro, CA (non-Bay Area), HI, IL, MA, NH, OR, VA, WA: $110,500-$165,500 USD
New York City Metro, Kirkland/Seattle: $117,200-$175,800 USD
All other US locations not previously listed: $98,500-$147,500 USD

Responsibilities

Automate and maintain day-to-day operations of storage systems to support application demands
Develop and maintain tools and automation scripts to streamline storage operations and improve efficiency
Monitor system performance, identify issues, and implement solutions to ensure high availability and reliability
Participate in agile practices such as daily stand-up meetings, task tracking boards, design and code reviews, automated testing, and continuous integration and deployment
Continuously improve system reliability, performance, and capacity through proactive monitoring, automation, and optimization

Preferred Qualifications

Experience with containerization and orchestration tools (e.g., Docker, Kubernetes)
Exposure to and experience working with compute platforms (e.g., OpenStack, AWS)
Familiarity with, and ability to contribute to, CI/CD pipelines and automation workflows

Requirements

2+ years of professional experience with Ceph in a production environment, including deployment, configuration, and management of Ceph clusters and systems
2+ years of experience in site reliability engineering or a similar role
Experience working on Linux/Unix systems, with a focus on automation and operating at scale
Proficiency in Python or Bash
Experience with Ansible, Terraform, or SaltStack
Experience with Nagios-based monitoring tools, such as Icinga2
Experience with observability tooling, such as Prometheus, Grafana, Mimir, and Loki
Solid understanding of core networking concepts and protocols, particularly in relation to Linux/Unix systems