Writer

Platform Engineer, MLOps

Overview

The role involves deploying and managing infrastructure for AI/ML operations and collaborating with engineers to develop CI/CD pipelines.

The ideal candidate has professional experience with model training and large-scale ML systems and is skilled at troubleshooting complex systems.

hybrid, full-time, English, Docker, Kubernetes, PyTorch, Terraform, Python, bash, Google Cloud, AWS, Azure, git, GitHub, Prometheus, Grafana

Locations

  • United States, California, San Francisco

Requirements

  • 5+ years building core infrastructure
  • Experience running inference clusters at scale
  • Experience operating orchestration systems such as Kubernetes at scale

Responsibilities

  • Design and deploy CI/CD pipelines
  • Manage monitoring, logging, and alerting systems
  • Ensure training environments are available
  • Develop containerization and orchestration systems
  • Operate large Kubernetes clusters
  • Improve the reliability of software solutions
  • Measure and optimize system performance
  • Provide operational support for software applications

Benefits

  • Generous PTO
  • Medical, dental, and vision coverage
  • Paid parental leave
  • Fertility and family planning support
  • Flexible spending account
  • Health savings account
  • Annual work-life stipends
  • Company stock options