Machine Learning Engineer

Osmosis | San Francisco, California, United States | 2mo ago
This role has closed. The original posting follows.
$180,000 – $250,000/yr | full-time | on-site | mid | 1+ years | visa sponsorship
skills: reinforcement learning, distributed training, python, fastapi, golang, react, typescript, next.js, aws fargate, docker, kubernetes, aws sagemaker, pytorch, fsdp, vllm, sglang, dynamodb, s3, low-level optimization

About Osmosis

At Osmosis, we help companies use cutting-edge reinforcement learning techniques to fine-tune open-source language models that beat foundation models on performance, latency, and cost. 

We’ve raised $7M in funding from Y Combinator, top institutional investors like CRV and Audacious Ventures, and angel investors including Paul Graham (Y Combinator), Erik Bernhardsson (Modal Labs), Misha Laskin (Reflection AI), and Guillermo Rauch (Vercel).

About the Role

We're looking for a Machine Learning Engineer to contribute to high-performance distributed training infrastructure for RL at scale. You'll work directly with our founding team and design partners to push the boundaries of what's possible with post-training and continual learning systems.

This role requires expertise in RL algorithms, distributed training, and low-level optimization. You'll have exceptional agency to make impactful decisions while working in a fast-paced, customer-driven environment.

Responsibilities

You’ll contribute to work in areas like:

  • Distributed Training Infrastructure: implement new RL algorithms and build scalable post-training pipelines
  • Resource Management & Optimization: design infrastructure systems for efficient GPU utilization and dynamic resource allocation
  • Customer-Facing Work: work directly with customers on production deployments and custom model development

Technology

  • Backend: Python FastAPI, Golang
  • Frontend: React, TypeScript, Next.js
  • Cloud Infrastructure: AWS Fargate, Docker, Kubernetes, AWS SageMaker
  • ML Frameworks: Verl / slime / Megatron-LM / SkyRL, PyTorch (FSDP experience is a plus), vLLM / SGLang
  • Databases: DynamoDB, S3

Benefits

  • Health insurance