Site Reliability Engineer II (x/f/m)
Published: 2025-07-18
Improving Healthcare for Good
Doctolib is Europe's leading ehealth company. We have one clear mission: to contribute to improving our healthcare systems in partnership with healthcare professionals. We aim to create the largest community of healthcare providers in the world and build with them the practices and hospitals of the future. We believe that it's the best ...
Job details
Doctolib’s engineering environment is rich, and we build innovative products and features that aim to make doctors' and patients' lives easier every day. We are looking for a Site Reliability Engineer II to keep Doctolib production systems running smoothly. You will also be a key player in supporting the exponential growth of Doctolib services.
What you will do
As a Site Reliability Engineer II at Doctolib, you will play a critical role in fostering a platform-oriented approach to reliability and performance, empowering teams to embrace the “You build it, you run it” culture.
Your role:
- Platform Reliability: Design, build, and maintain the core platform infrastructure to enable scalability and resilience, ensuring the platform can support hundreds of thousands of concurrent users
- Automation and Efficiency: Develop tools and processes to automate the deployment, scaling, and lifecycle management of services, reducing toil and increasing reliability
- Monitoring and Incident Management: Implement robust monitoring, alerting, and incident response mechanisms to detect and resolve issues before they impact practitioners or patients
- Disaster Recovery: Design and execute disaster recovery strategies to ensure business continuity in critical scenarios
- Collaborate with Feature Teams: Partner with product and engineering teams to embed reliability best practices, enhance performance, and instill operational excellence into their workflows
- Continuous Improvement: Research and evaluate emerging technologies and tools to continuously enhance platform reliability and operational practices
- On-Call Ownership: Participate in an on-call rotation to maintain a proactive, efficient response to incidents, reinforcing the “You build it, you run it” philosophy
You could be our next teammate if you:
- Have solid hands-on experience (3+ years) on a large-scale production platform
- Have proven experience with cloud platforms such as AWS, Azure or Google Cloud
- Have a solid understanding of containerization and orchestration technologies (Docker and Kubernetes)
- Have a strong understanding of Helm for managing Kubernetes manifests and ArgoCD for GitOps workflows
- Have proficiency in at least one programming language (Ruby, Python, Go, Java, etc.) and a deep understanding of infrastructure as code principles
- Have experience with monitoring and observability tools
- Like troubleshooting performance issues in complex environments
- Speak English
What we offer
- Free Health Insurance for you & your family
- Up to 14 days of RTT
- Parental care program (1 month off in addition to the legal parental leave and 0.5 days off per child when school starts)
- Wellbeing program (free mental health and coaching offer with our partner moka.care)
- A flexible workplace policy offering both hybrid and office-based modes
- Flexibility days allowing you to work from EU countries and the UK 10 days per year
- Lunch voucher with Swile card
- Work Council subsidy to refund part of a sports club membership or creative class
- Bicycle subsidy
The interview process
- Recruiter interview
- Technical SRE interview
- System Design interview
- Behavioral interview
- Background / Reference check
- Offer!