Machine Learning Engineer, Model Evaluation

Hybrid | Full Time | $180K – $280K/yr | 🧠 Machine Learning

evaluation, benchmarking, red-teaming, LLM, AI safety

Job Description

Scale AI is looking for a Machine Learning Engineer to join our Model Evaluation team. You'll build systems and benchmarks to rigorously evaluate the capabilities and safety of frontier AI models for our enterprise and government clients.

You'll work on red-teaming, capability evaluations, and automated testing pipelines that help our clients understand what their models can and cannot do. This is a critical role in ensuring AI systems are deployed responsibly.

Requirements

3+ years of ML engineering experience
Strong Python skills
Experience with LLM evaluation and benchmarking
Familiarity with statistical analysis
Understanding of AI safety and alignment concepts
Experience with data pipelines and annotation systems is a plus

Benefits

Competitive salary and equity
Comprehensive health benefits
401(k) with matching
Flexible PTO
$2,000 annual learning stipend
Hybrid work flexibility

Job Details

Posted: April 5, 2026
Expires: May 5, 2026

About the Company

Scale AI

San Francisco, CA

Scale AI accelerates the development of AI applications by providing high-quality training data and evaluation infrastructure.
