Site Reliability Engineer (SRE) – Storage

Scaleway · Paris, France

Hybrid · Full-time · Senior · Infrastructure

Job Description

Why we need you:

As we grow, we’re strengthening our Site Reliability Engineering (SRE) team to guarantee the robustness and performance of our services. Your mission: continuously improve the reliability and scalability of our platforms, keeping services high-performing and resilient while optimizing infrastructure through automation.

Your future team:

We work in a collaborative, international environment where diverse Scalers and a spirit of sharing bring new projects to life every day. You’ll join a team of experts focused on production stability and infrastructure efficiency. You’ll also join the SRE Guild, a collective dedicated to technical innovation and sharing best practices.

Your day-to-day:

Develop tools and frameworks to streamline deployments and infrastructure management.
Automate repetitive tasks to improve overall efficiency and system reliability.
Define and implement key objectives and indicators (SLOs, KPIs) to track and steer service performance.
Optimize monitoring and alerting systems to minimize alert fatigue.
Identify, diagnose and quickly resolve production incidents.
Analyze root causes and implement preventive measures.
Apply best practices: fault tolerance, load balancing and redundancy.
Collaborate with Dev and Product teams to integrate reliability from the design phase.
Participate in architecture reviews and spread SRE expertise across the organization.