GPU Cloud Platform Engineer

Yotta Labs

Overview

The role involves designing, deploying, and operating large-scale GPU infrastructure for AI workloads.

The ideal candidate has 5+ years of experience in cloud-native development and strong Kubernetes experience.

Remote, mid, full-time, English, Kubernetes, Docker, Prometheus, Grafana, AWS, GCP, Azure, Helm

Locations

Canada
Argentina
United States
Brazil
Mexico

Requirements

Bachelor's degree in a relevant field
3+ years in systems engineering or DevOps
5+ years in cloud-native development or AI engineering
2+ years in Kubernetes multi-cluster management
Familiarity with Kubernetes ecosystem
Proficiency in Docker and containerization
Experience with monitoring tools like Prometheus and Grafana
Hands-on experience with cloud platforms like AWS, GCP, or Azure

Responsibilities

Build and operate large-scale GPU clusters
Conduct performance testing of GPU clusters
Deploy large models across multi-cluster environments
Participate in GPU cluster scheduling and optimization
Build a unified multi-cluster management system
Coordinate with IDC (internet data center) providers for GPU clusters

Benefits

Flexible remote work environment
Opportunity to work on cutting-edge technologies
Collaboration with experts from leading institutions
Visionary team aiming to redefine AI infrastructure