Site Reliability Engineer

Platform.sh

Overview

Key role in transitioning to a Site Reliability Engineering model, with a focus on reliability and efficiency.

The ideal candidate has a solid understanding of SRE principles and hands-on experience with cloud platforms.

Remote, Prometheus, Grafana, ELK Stack, Terraform, Ansible, AWS, GCP, Azure, Docker, Kubernetes

Locations

United States, California
United States, Washington

Requirements

DevOps, Cloud Operations, or SRE expertise
Advanced Linux internals expertise
Proficiency in programming languages like Go or Python
Strong scripting skills in Python, Bash, or Go
Extensive experience with cloud platforms like AWS, GCP, and Azure
Hands-on experience with Docker and Kubernetes (nice to have)
Strong problem-solving skills and collaboration abilities

Responsibilities

Enhance system monitoring
Automate deployments and workflows
Optimize CI/CD pipelines
Manage cloud infrastructure
Support incident response and post-mortems
Collaborate with cross-functional teams
Drive technical innovation

Benefits

Flexible PTO
Company stock options
Professional development budget
Office equipment budget
Wellness budget
Annual team gatherings
Internet reimbursement
Inclusive parental leave
Remote work travel program