Hugging Face

Research Engineer, Efficient ML

Remote · Full Time · $150K – $230K/yr · 🧠 Machine Learning
quantization, PEFT, model compression, efficient ML, open source, remote

Job Description

Hugging Face is looking for a Research Engineer focused on efficient ML to work on model compression, quantization, and optimization techniques. You'll contribute to our open-source libraries (PEFT, bitsandbytes, optimum) and help make state-of-the-art models accessible to everyone, including those without access to expensive hardware.

Your work will directly impact millions of developers who rely on Hugging Face to run models efficiently.

Requirements

  • Strong ML engineering background
  • Experience with model quantization, pruning, or distillation
  • Proficiency in Python and PyTorch
  • Understanding of hardware architectures (GPU, CPU, edge devices)
  • Experience with ONNX or TensorRT is a plus
  • Open-source contribution experience preferred

Benefits

  • Fully remote
  • Competitive salary
  • Equity
  • Health benefits
  • Open-source impact

Job Details

Posted: April 5, 2026
Expires: May 5, 2026

About the Company

Hugging Face

New York, NY

The AI community building the future. The platform where the machine learning community collaborates on models, datasets, and applications.