
Staff Site Reliability Engineer
- Netherlands
- Permanent
- Full-time
- Design and implement highly available, multi-cloud infrastructure
- Build observability and automation solutions that prevent incidents, not just detect them
- Lead technical initiatives across teams, mentoring engineers and driving SRE best practices
- Own production reliability, from architecture decisions to incident response
- Optimize performance across the full stack, from kernel tuning to application-level improvements
- 7+ years of SRE/DevOps experience with large-scale distributed systems
- Strong OpenStack experience (VIO, RHOSP, or similar distributions)
- Expert-level automation skills with Ansible and Terraform
- Strong programming skills in Python and Bash scripting
- Experience with Git-based CI/CD pipelines (GitLab CI, GitHub Actions, or similar)
- Deep understanding of networking (TCP/IP, BGP, load balancing, VPN technologies)
- Experience building and maintaining custom orchestration solutions
- Proven track record of improving system reliability and reducing operational overhead
- Experience with multiple OpenStack distributions and deployment models
- Experience with managed Kubernetes services (GKE, EKS, AKS)
- Experience with SASE, SD-WAN, or zero-trust architectures
- Familiarity with FortiMonitor or similar monitoring platforms (e.g., Prometheus)
- Security certifications or demonstrated security expertise