Site reliability engineer jobs

Senior Site Reliability Engineer
BrainGu
Grand Rapids, United States (city)
$150k - $170k
C
Docker
Kubernetes
Helm
Cloud
Terraform
AWS
GCP
Azure
Developer
Site reliability engineer
Solutions Architect
Ansible
Posted 2 days ago
IT & Security Admin in DevOps Team (New)
April
Tel Aviv, Israel (city)
DevOps
Python
Docker
Kubernetes
Cloud
Terraform
AWS
GCP
Azure
Git
Site reliability engineer
Bash
grpc
Linux
Grafana
Prometheus
Posted 6 days ago
Senior Site Reliability Engineer
Convera
Vilnius, Lithuania (city)
€4k - €7k
Python
Cloud
Site reliability engineer
Rust
Linux
Grafana
Posted 13 days ago
Video Software Engineer (Mid - Staff Level)
Mux
San Francisco, United States (city)
$188k - $240k
C
API
Golang
Cloud
Video
Open Source
Software engineer
Site reliability engineer
Posted 15 days ago
Engineering Manager, GitLab Delivery - Release
GitLab
Munich, Germany (city)
Back-end
Product Manager
Engineering Manager
AI
Open Source
Developer
Site reliability engineer
GitLab
Agile
Posted 15 days ago
Senior Site Reliability Engineer
Prove
Ireland, Northern Europe (country)
Java
Python
Docker
Kubernetes
React
Site reliability engineer
Network
Crypto
Posted 18 days ago
Security Engineer, Production
Niantic
Zürich, Switzerland (city)
Java
Python
Docker
Kubernetes
Cloud
AWS
Security engineer
Site reliability engineer
Posted 19 days ago
Site Reliability Engineer
Flow Traders
Eastern Asia, Asia (sub-continent)
DevOps
Python
Docker
Kubernetes
Search
Cloud
Terraform
AWS
GCP
Azure
Site reliability engineer
Bash
Kafka
Ansible
Posted 19 days ago
Senior Site Reliability Engineer
Censys
Virginia, United States (region)
$145k - $195k
Python
Kubernetes
Helm
Cloud
Terraform
Developer
Site reliability engineer
Network
GitHub
grpc
Linux
Grafana
Prometheus
Posted 19 days ago
Cloud Engineering Manager
Lirio
Tennessee, United States (region)
$190k - $225k
DevOps
Data science
ML Engineer
Engineering Manager
Architect
Java
Python
Docker
Kubernetes
Helm
Cloud
Postgres
Terraform
AWS
GCP
Azure
AI
Site reliability engineer
Network
Bash
Apache
Gradle
Oracle
Kafka
Linux
Ansible
Datadog
Posted 20 days ago
Senior Backend Engineer (Elixir)
Remote
Warsaw, Poland (city)
$51k - $116k
Front-end
Back-end
Docker
Kubernetes
React
Postgres
AWS
Vue.js
AI
Site reliability engineer
Phoenix
GitHub
GitLab
Jenkins
Angular
Posted 24 days ago
Senior Site Reliability Engineer
Reach Financial
New York, United States (region)
$100k - $140k
Python
Javascript
Docker
Kubernetes
Cloud
Terraform
AWS
Typescript
Site reliability engineer
Kafka
GitHub
Grafana
Prometheus
Jenkins
Datadog
EC2
Lambda
Posted 24 days ago
DevOps Engineer (New)
LastPass
Portugal, Southern Europe (country)
DevOps
Cloud
AWS
GCP
Azure
Developer
Site reliability engineer
Shell
Jira
GitLab
Linux
Unix
Posted 26 days ago
Site Reliability Engineer - EMEA
Appspace
Spain, Southern Europe (country)
QA Engineer
Python
Kubernetes
Helm
Cloud
Terraform
AWS
Azure
Git
Site reliability engineer
Shell
MySql
MongoDB
Jira
BitBucket
Linux
Posted 26 days ago
Staff Software Engineer, Infrastructure
Freenome
San Francisco, United States (city)
$188k - $288k
ML Engineer
Big Data Engineer
Python
Kubernetes
Cloud
Terraform
AWS
GCP
Azure
Software engineer
Site reliability engineer
Kafka
Linux
Prometheus
Posted 26 days ago
Senior Site Reliability Engineer - Data Pipeline team
Bloomreach
Brno, Czech Republic (city)
€3k - €3k
Back-end
DevOps
Engineering Manager
Python
Kubernetes
Helm
Redis
Search
Marketing
Cloud
Terraform
GCP
AI
Site reliability engineer
Apache
Kafka
GitLab
Grafana
Prometheus
Posted 26 days ago
Site Reliability Engineer
ROLLER
Melbourne, Australia (city)
Architect
Cloud
AWS
Site reliability engineer
Network
Agile
Posted 27 days ago
Senior Cloud Engineer (New)
LendingTree
Seattle, United States (city)
$120k - $150k
Python
Cloud
Site reliability engineer
GitHub
GitLab
Android
iOs
Agile
Scrum
Posted 28 days ago
Senior Software Engineer
Mozilla
United States, Northern America (country)
$126k - $168k
Front-end
Back-end
DevOps
Python
C
Javascript
Kubernetes
API
Cloud
React
AWS
GCP
SQL
Typescript
REST APIs
AI
Open Source
Developer
Full-stack
Site reliability engineer
Django
Grafana
S3 Bucket
Posted 28 days ago
Staff Site Reliability Engineer
Aetion
Barcelona, Philippines (city)
DevOps
Big Data Engineer
Java
Python
Javascript
Docker
Kubernetes
Cloud
Video
Terraform
AWS
GCP
SQL
Typescript
Site reliability engineer
Network
Shell
GitHub
Agile
Linux
Unix
Jenkins
Ansible
Posted 29 days ago
Site Reliability Engineer III - Operations & Observability (New)
Rent the Runway
Galway, Ireland (region)
DevOps
Data science
ML Engineer
Docker
Kubernetes
Cloud
Terraform
AWS
GCP
Azure
Open Source
Full-stack
Site reliability engineer
Splunk
Posted 1 month ago
Senior DevOps Engineer (f/m/d)
Conductor LLC
Berlín, El Salvador (city)
DevOps
Kubernetes
Search
Cloud
Terraform
AWS
Developer
Site reliability engineer
ElasticSearch
GitHub
Linux
Unix
Grafana
Ansible
CircleCi
Posted 1 month ago
Site Reliability Engineer II (SRE II)
OppFi
Chicago, United States (city)
$102k - $153k
DevOps
Java
Python
C
Javascript
Kubernetes
Cloud
Ruby
Terraform
AWS
GCP
Azure
Node.js
Site reliability engineer
Bash
GitHub
Linux
Grafana
Splunk
Prometheus
Ansible
CircleCi
Datadog
Chef
Posted 1 month ago
Senior Site Reliability Engineer
Aerospike
Bengaluru, India (city)
DevOps
Python
Docker
Kubernetes
Cloud
Terraform
AWS
Azure
Developer
Site reliability engineer
Solutions Architect
Bash
ElasticSearch
Linux
Unix
Grafana
Prometheus
Datadog
Posted 1 month ago
Senior Site Reliability Engineer
Striveworks
Austin, United States (city)
$150k - $190k
DevOps
ML Engineer
Python
Kubernetes
Helm
API
Cloud
Terraform
AWS
GCP
Azure
AI
Site reliability engineer
Network
Bash
Linux
Ansible
Posted 1 month ago
Staff Solutions Architect [REMOTE]
Upbound
United States, Northern America (country)
Kubernetes
Sales
API
Cloud
Terraform
AWS
Open Source
Developer
Site reliability engineer
Solutions Architect
GitHub
Posted 1 month ago
Senior Site Reliability Engineer
BrainGu
Boston, United States (city)
$150k - $170k
Docker
Kubernetes
Helm
Cloud
Terraform
AWS
GCP
Azure
Developer
Site reliability engineer
Solutions Architect
Ansible
Posted 3 months ago
Published: 2025-06-04  •  Austin, United States (city)
Terraform
AWS
Docker
C
Kubernetes
Helm
GCP
Azure
Cloud
Developer
Site reliability engineer
Solutions Architect
Ansible
$150k - $170k
On-site
Full-time

We are BrainGu

BrainGu is a technology company that builds developer platforms. We believe the future has to be innovated; it has to be created; it has to be secured. Through platforms that create order-of-magnitude improvements to quality in the form of resilience, scalability, reliability, and security – (rs)² – we enable our customers to deliver the future.

Our mission is to dream of, incubate, and scale dual-use technology platforms that unlock innovation.

Our vision is to unlock innovation by enabling more organizations to build high quality software faster, and at lower cost.

Overview

This role sits within the Engineering Operations Value Stream (EngOps), supporting our flagship Developer Experience Platform, SmoothGlue. As a member of the EngOps team, you will be responsible for working towards our SRE strategy and operating model and helping to mature our SRE discipline.

Building iteratively, with a strong understanding of the trade-offs required to implement SRE frameworks and capabilities, is a must-have, as is a strong willingness to collaborate. Automating yourself out of a job is viewed not as a risk but as a worldview required in this role.

You will work closely with our EngOps CTO and team as well as our Platform Product team to help inform and drive roadmaps, metrics, and overall organizational maturity. 

Responsibilities 

  • System Architecture and Design:
    • Design, implement, and manage highly available, scalable, and fault-tolerant systems.
    • Collaborate with software engineering teams to optimize application performance and reliability.
    • Evaluate and recommend appropriate technologies, tools, and infrastructure solutions.
  • Infrastructure Automation:
    • Develop and maintain infrastructure as code (IaC) using tools like Terraform, Ansible, or similar.
    • Automate deployment, configuration, and scaling of applications and services.
    • Implement continuous integration and continuous deployment (CI/CD) pipelines.
  • Monitoring and Incident Management:
    • Establish and maintain comprehensive monitoring, alerting, and logging systems.
    • Respond to incidents, troubleshoot issues, and ensure timely resolution to minimize downtime.
    • Participate in on-call rotations and post-incident analysis to drive continuous improvement.
  • Performance Optimization:
    • Analyze system performance and identify bottlenecks; implement optimizations.
    • Conduct capacity planning to anticipate future resource needs and scalability requirements.
    • Implement strategies to improve system response times and overall efficiency.
  • Security and Compliance:
    • Collaborate with security teams to implement best practices for system and data protection.
    • Ensure compliance with industry standards and regulations relevant to the company's operations.
  • Mentorship and Collaboration:
    • Provide guidance, mentorship, and technical leadership to junior SREs and engineering teams.
    • Foster a collaborative environment by sharing knowledge and promoting best practices.

Requirements 

  • Bachelor’s degree or equivalent work experience.
  • 6+ years of relevant work experience.
  • Highly motivated self-starter with excellent interpersonal and communication skills. Able to communicate efficiently at multiple levels of seniority.
  • Highly developed documentation skills.
  • Experience working in a customer-facing role; customers may be end users, developers, or organizational leadership.
  • Certification or formal training in site reliability engineering concepts and practices
  • Prior experience working towards SLIs, SLOs and observability capabilities at a large scale.
  • Experience working on observability, logging and metrics toolsets.
  • Experience with Kubernetes (k8s) and container technologies such as Docker, OpenShift, RKE, and EKS.
  • Experience troubleshooting routing and networking in a cloud environment (AWS, GCP or Azure) 
  • Experience with Secrets products such as HashiCorp Vault or CyberArk.
  • Highly effective at navigating large and complex organizations.
  • Ability to work under pressure and manage tight deadlines or unexpected changes in expectations or requirements.
  • Experience working in CISO- or security-led organisations is desirable but not essential.
  • AWS Solutions Architect - Associate certification is preferred.
  • Must have an active Secret clearance

Tech Stack 

  • Kubernetes, Docker, CRI-O, containerd, or other container technologies
  • Major programming or scripting languages
  • Istio, Linkerd, Consul, or other service mesh
  • Ansible, Terraform, Helm, Kustomize, or other Infrastructure as Code (IaC) and Configuration as Code (CaC) tools
  • AWS, Azure, GCP, or other cloud technologies

Specific Job Needs

  • Located in one of the following locations: Austin, Grand Rapids, San Antonio, San Diego, Raleigh, Washington DC 
  • Requires active Secret U.S. Security Clearance, requiring U.S. Citizenship
  • Willing to travel up to 50%
  • Expected base salary of $150,000 - $170,000

Employee Perks

  • 12 weeks of fully paid parental leave for birth or adoption
  • 31 days of PTO, which includes federal holidays
  • 100% employer-paid insurance plans (employee-only)
  • 401(k) matching up to 5%
  • $10k “BrainBudget” to facilitate your personal and professional growth
  • $1,500 “Battle Station Budget” to outfit your home office with maximum RGB
  • 85% paid healthcare premiums for you, your spouse, and dependents
  • A monthly cell phone and internet stipend
  • Supplemental Tricare plan for Veterans
  • Monthly stipend for Veterans