Site Reliability Engineer – Storage Engineer

GoDaddy  ·  United States, Austin, Texas
Hybrid Full-time Entry Engineering

Job Description

Location Details:

At GoDaddy the future of work looks different for each team. Some teams work in the office full-time; others have a hybrid arrangement (they work remotely some days and in the office some days) and some work entirely remotely.

This is a remote position, so you’ll be working remotely from your home. You may occasionally visit a GoDaddy office to meet with your team for events or meetings.

Join Our Team

GoDaddy is seeking a highly skilled and motivated Site Reliability Engineer (SRE) to join our dynamic team. This role centers on automating and maintaining our storage infrastructure, particularly Ceph, to ensure the reliability, scalability, and performance of our systems.

Compensation:

The estimated pay ranges for this role are listed below. In addition to base pay, this role may be eligible for other forms of compensation, which may include a corporate bonus and/or equity awards, subject to the terms of applicable plans and individual eligibility.

  • Bay Area (Santa Clara, San Francisco) and Los Angeles: $128,000-$192,000 USD
  • Austin, D.C. Metro, CA (non-Bay Area), HI, IL, MA, NH, OR, VA, WA: $110,500-$165,500 USD
  • New York City Metro, Kirkland/Seattle: $117,200-$175,800 USD
  • All other US locations not previously listed: $98,500-$147,500 USD

Responsibilities and Experience

  • Automate and maintain day-to-day operations of storage systems to support application demands
  • Develop and maintain tools and automation scripts to streamline storage operations and improve efficiency
  • Monitor system performance, identify issues, and implement solutions to ensure high availability and reliability
  • Participate in agile practices such as daily stand-up meetings, task tracking boards, design and code reviews, automated testing, and continuous integration and deployment
  • Continuously improve system reliability, performance, and capacity through proactive monitoring, automation, and optimization
  • Experience with containerization and orchestration tools (e.g., Docker, Kubernetes)
  • Exposure to and experience working with compute platforms (e.g., OpenStack, AWS)
  • Familiarity with, and ability to contribute to, CI/CD pipelines and automation workflows

Requirements

  • 2+ years of professional experience with Ceph in a production environment, including deployment, configuration, and management of Ceph clusters and systems
  • 2+ years of experience in site reliability engineering or a similar role
  • Experience working on Linux/Unix systems, with a focus on automation and operating at scale
  • Proficiency in Python or Bash
  • Experience with Ansible, Terraform, or SaltStack
  • Experience with Nagios-based monitoring tools, such as Icinga2
  • Experience with observability tooling, such as Prometheus, Grafana, Mimir, and Loki
  • Solid understanding of core networking concepts and protocols, particularly in relation to Linux/Unix systems