Staff Site Reliability Engineer, Compute

Crusoe

Overview

This role focuses on supporting virtualization and kernel-level performance for AI-first cloud infrastructure.

The ideal candidate has 8+ years of experience in SRE or Linux systems engineering and strong Linux kernel knowledge.

hybrid, senior, permanent, full-time, Linux, KVM, VMware, C, Go, Rust, CI/CD

Locations

San Francisco, California, United States

Requirements

8+ years of experience in SRE or Linux systems engineering
Strong proficiency in Linux kernel internals
Experience with virtualization technologies
Expert-level skills in C, Go, or Rust
Proficiency in Infrastructure as Code tooling
Strong understanding of compute scheduling
Experience with system-level debugging

Responsibilities

Develop automation and observability tools
Support and scale virtualization stack
Identify and resolve performance bottlenecks
Optimize performance for AI and HPC workloads
Participate in root cause analysis
Tune kernel subsystems
Implement support for emerging compute hardware

Benefits

Hybrid work schedule
Competitive pay
Restricted Stock Units
Health insurance options
Employer HSA contributions
Paid Parental Leave
401(k) match
Generous paid time off