Site Reliability Engineer (SRE) – Storage

Scaleway  ·  France, Paris
Hybrid  ·  Full-time  ·  Senior  ·  Infrastructure

Job Description

Why we need you:

As we grow, we’re strengthening our Site Reliability Engineering (SRE) team to guarantee the robustness and performance of our services. Your mission: continuously improve the reliability and scalability of our platforms, keeping services high-performing and resilient while optimizing infrastructure through automation.

Your future team:

We work in a collaborative, international environment where Scalers from diverse backgrounds and a spirit of sharing bring new projects to life every day. You’ll join a team of experts focused on production stability and infrastructure efficiency. You’ll also join the SRE Guild, a collective dedicated to technical innovation and sharing best practices.

Your day-to-day:

  • Develop tools and frameworks to streamline deployments and infrastructure management.
  • Automate repetitive tasks to improve overall efficiency and system reliability.
  • Implement key indicators (SLOs, KPIs) to track and steer service performance.
  • Optimize monitoring and alerting systems to minimize alert fatigue.
  • Identify, diagnose and quickly resolve production incidents.
  • Analyze root causes and implement preventive measures.
  • Apply best practices: fault tolerance, load balancing and redundancy.
  • Collaborate with Dev and Product teams to integrate reliability from the design phase.
  • Participate in architecture reviews and spread SRE expertise across the organization.

Benefits

  • Hybrid work: up to 3 days of remote work per week.
  • Modern, well-located offices with outdoor spaces and bike parking.
  • Chef-prepared healthy meals at HQ, breakfast year-round at all sites.
  • Swile card for lunches at regional sites.
  • Gym access, daycare places and support for care services.
  • International environment with dozens of nationalities.
  • Career mobility and access to Iliad Group entities.

Requirements

  • Mastery of programming languages such as Python, Go or Rust.
  • Solid experience in Infrastructure as Code (IaC) and CI/CD pipelines (GitLab).
  • Mastery of Kubernetes, container images and Linux systems troubleshooting.
  • Experience with monitoring and logging tools (OpenMetrics, OpenTelemetry).
  • Knowledge of storage technologies such as S3, CephFS or ZFS.
  • Collaborative mindset and ability to work in an international environment.
  • Pragmatic approach to problem solving and incident management.
  • Excellent communication skills in English.
  • Interest in coaching and improving developer experience.
  • Proactivity and autonomy in a fast-evolving environment.