One

Site Reliability Engineer

Overview

This role focuses on ensuring the reliability of critical services and establishing SRE practices.

The ideal candidate has 10+ years of experience in distributed systems and strong experience with observability platforms.

170k USD / year · remote · senior · full-time · Datadog · Splunk · Prometheus · Grafana · Python · TypeScript · Go

Locations

  • United States

Requirements

  • 10+ years of experience
  • 5+ years with observability platforms
  • Fluency in programming languages
  • Strong software development practices
  • Self-motivated and inquisitive
  • Great communication skills
  • Humble, Hungry, Honest
  • Act-like-an-owner mentality

Responsibilities

  • Ensure service availability and reliability
  • Collaborate with engineering teams
  • Design and maintain reliability tools
  • Participate in on-call rotation
  • Drive incident management
  • Define SRE practices
  • Mentor engineers
  • Optimize systems and workflows

Benefits

  • Competitive salary
  • Stock options
  • Health benefits
  • 401(k) plan
  • Flexible time off
  • Opportunities for growth
  • Inclusive culture
  • Real impact work