Principal Site Reliability Engineer
Groupon
Overview
This role focuses on ensuring the performance, availability, and resilience of platforms.
The ideal candidate has 10+ years in systems engineering, with expertise in cloud platforms and SRE.
remote, senior, Terraform, Ansible, Kubernetes, Docker, Python, Go, Bash, Prometheus, Grafana, ELK Stack
Locations
Requirements
10+ years in systems engineering
5+ years in SRE or DevOps
Expertise in cloud platforms
Proficiency in programming languages
Advanced knowledge of IaC tools
Deep understanding of networking principles
Proven track record in high-availability systems
Exceptional analytical skills
Responsibilities
Architect and maintain fault-tolerant systems
Drive automation in infrastructure management
Create and optimize CI/CD pipelines
Build observability solutions
Collaborate on SLIs, SLOs, and error budgets
Design performance testing and scalability strategies
Benefits
Cutting-edge technologies
Collaborative work culture
Professional growth opportunities