Research Engineer, Efficient ML

Remote | Full Time | $150K – $230K/yr | 🧠 Machine Learning

quantization, PEFT, model compression, efficient ML, open source, remote

Job Description

Hugging Face is looking for a Research Engineer focused on efficient ML to work on model compression, quantization, and optimization techniques. You'll contribute to our open-source libraries (PEFT, bitsandbytes, optimum) and help make state-of-the-art models accessible to everyone — including those without access to expensive hardware.

Your work will directly impact millions of developers who rely on Hugging Face to run models efficiently.

Requirements

Strong ML engineering background
Experience with model quantization, pruning, or distillation
Proficiency in Python and PyTorch
Understanding of hardware architectures (GPU, CPU, edge devices)
Experience with ONNX or TensorRT is a plus
Open-source contribution experience preferred

Benefits

Fully remote
Competitive salary
Equity
Health benefits
Open-source impact


Job Details

Posted: April 5, 2026
Expires: May 5, 2026

About the Company

Hugging Face

New York, NY

The AI community building the future. The platform where the machine learning community collaborates on models, datasets, and applications.
