Overview
Research Engineer role focused on developing large language model training infrastructure.
The ideal candidate has strong software engineering skills and experience with distributed systems.
315k USD / year, hybrid, English, Python, JAX, PyTorch, MLOps, cloud computing
Locations
United States, California, San Francisco
Requirements
Bachelor's degree or equivalent experience
Strong software engineering skills
Experience in building distributed systems
Responsibilities
Design and implement ML training infrastructure
Develop core ML framework primitives
Create automated evaluation systems
Implement monitoring tools
Optimize data loading pipelines
Collaborate with research teams
Develop infrastructure for hyperparameter sweeps
Benefits
Collaborative office space