AI / Eval Engineer

Meet Life Sciences | San Francisco, California, United States | 2mo ago

This role has closed. The original posting follows.

full-time | on-site | mid

skills: llms, search

Our client is a small team building technology that puts better clinical information in the hands of healthcare professionals — at a scale that belies the size of the team behind it. They've grown fast, they operate lean by design, and they're now building out the founding team that will shape the next decade of the company.

If you're the kind of person who can't leave a broken system alone, who goes three layers deeper than the obvious answer, who finds the pattern inside the failure — this was written for you.

What You'll Work On

Own the System. Every layer is yours — model behavior, retrieval, ranking, the smallest parameters. You'll figure out which ones actually move the needle, and move them.
Build the Feedback Loop. Measure what matters. Design the frameworks that tell you not just when the system fails, but why — and what it means for the next iteration.
Run the Experiments. Break things on purpose. Test the edges. Come back with something better. Rinse, repeat, ship.
Validate in the Real World. Work with leading healthcare institutions to stress-test performance across specialties, clinicians and clinical settings. See what holds up when it really counts.
Work with the Founders. No middle layers. You'll work directly with the people building this — on the problems that matter most.

What You Bring

Deep technical grounding in LLMs, search, or similar systems
An obsession with understanding failures, not just fixing them
The ability to move fast and think carefully at the same time
Comfort in ambiguity — you make your own path when there isn't one
High standards that don't slip when the pressure is on

Details

Based in San Francisco. Remote considered for the right person, with relocation support available. Competitive salary, meaningful equity, full-time.

Interested? Apply and I'll be in touch if there's a fit. We work with a number of exciting clients across the industry — so even if this one isn't quite right, it is worth the chat!
