Site Reliability Engineer

AION

Overview

The role focuses on building reliable, scalable infrastructure for an AI cloud platform.

The ideal candidate has 3-8 years of experience in SRE or DevOps, with expertise in cloud-native systems.

Individuals in this role are expected to relocate to Bangalore, though exceptions can be made.

hybrid, mid, full-time, English, AWS, GCP, Azure, Kubernetes, Terraform, Prometheus, Grafana, ELK, Bash, Python, Go, Docker

Locations

  • India, Karnataka, Bengaluru

Requirements

  • 3-8 years of experience in SRE or DevOps
  • Strong experience with Terraform or similar IaC tools
  • Deep expertise with AWS, GCP, or Azure

Responsibilities

  • Design and implement monitoring and alerting systems
  • Develop automation for infrastructure provisioning
  • Create and maintain runbooks for operational scenarios
  • Implement service mesh solutions
  • Design logging systems for visibility
  • Own capacity planning
  • Implement CI/CD pipelines
  • Design self-healing systems

Benefits

  • Competitive salary and benefits package
  • Flexible work environment
  • Opportunities for professional growth