Staff ML Research Engineer | LLM Fine-Tuning | RL | SFT | DPO

Optimal | London, United Kingdom | 6d ago
£210,000 – £305,000/yr | full-time | on-site | senior | 3–5 years
skills: llm fine-tuning, reinforcement learning, sft, dpo, ppo, grpo, instruction tuning, information retrieval, agent capability, software engineering, ml engineering, voice model, speech model, multimodal model, synthetic data generation, evaluation pipelines

Staff ML Research Engineer | LLM Fine-Tuning | RL | SFT | DPO | Must Have Startup Experience

Salary: £210,000 - £305,000 + meaningful equity

Contract: Permanent

Start: ASAP

Working model: Full-time onsite (some flex around this) - Shoreditch, London

Eligibility: Must have existing UK work authorisation - Dependent visa is fine

What You'll Get

  • £210,000 – £305,000 + meaningful equity
  • Full medical, dental, and vision coverage
  • Uncapped holiday - take what you need, when you need it
  • Daily meals covered - breakfast, lunch, dinner, and snacks on site every day

Optimal has teamed up with an exciting AI company based in London that’s growing its research team, offering a rare opportunity to join at just the right time.

This is a company that has built its own proprietary models, has serious commercial traction, and is scaling fast. They're not wrapping third-party APIs and calling it AI. They've done the hard work - and it's paying off.

The research function sits at the core of everything. You'll be building AI systems that push the boundary of what's possible - designing and implementing state-of-the-art methods for instruction tuning, information retrieval, and agent capability. Your work won't sit in a repo waiting for review. It'll ship, it'll scale, and it'll be used by real customers.

This is a small, elite team. Everyone here has high ownership, high impact, and zero time for passengers. If that sounds like your environment - keep reading.

🚨 Please only apply if you have ALL of the following 🚨

  • 5+ years in applied AI/ML engineering or research (exceptional candidates at 3+ considered)
  • Hands-on, production-grade experience fine-tuning LLMs (SFT, DPO, PPO, GRPO, RL)
  • Proven track record deploying models in live, customer-facing environments
  • Experience working with large open-source LLMs (e.g. Llama or similar)
  • Startup or scale-up experience - you've moved fast, owned outcomes, shipped real things
  • Strong software engineering fundamentals - this is not a data science or research-only role

ℹ️ Very Important Notes

  • This role requires deep applied ML engineering - not suitable for pure researchers or academic profiles
  • You must be comfortable in a fast-moving, high-ownership startup environment
  • You'll be expected to move from research idea → production system at pace
  • Onsite in Shoreditch, London - full-time (staff-level candidates have some flexibility)

Must-Haves

  • Production-grade LLM fine-tuning experience - SFT, RL, DPO, PPO, GRPO
  • Deep familiarity with large open-source language models
  • Strong software engineering skills - you write clean, bug-free ML code
  • Ability to break down ambiguous research problems into clear, shippable milestones
  • Startup mindset - evidence of high personal achievement and genuine ownership
  • Excited about applied product impact, not just foundation model or academic research

Bonus Experience

  • Voice and speech model experience
  • Multimodal model exposure
  • Experience generating synthetic data and building evaluation pipelines
  • Background from product-driven orgs (not just research labs)
  • Prior founding engineer or early-stage startup experience

What You'll Be Doing

Research & Model Development

  • Build and fine-tune models for complex, real-world customer-facing tasks
  • Run experiments with open-source LLMs to drive order-of-magnitude gains in latency and performance
  • Design and implement state-of-the-art instruction tuning and information retrieval methods

Production Ownership

  • Take research from prototype to fully deployed, production-grade system
  • Validate model behaviour against real-world workflows and user feedback
  • Improve reliability, capability, and performance of live AI systems

Collaboration

  • Work directly alongside science and engineering teams on new architectures
  • Feed real-world findings back into platform evolution and roadmap
  • Break down research ideas into clear, iterative milestones

What They're Looking For

  • A technically sharp, applied engineer with a founder-like mentality
  • Someone who thrives in ambiguity and moves with urgency
  • A builder who wants to ship things that matter, not just publish papers
  • An engineer who's energised by ownership, pace, and real-world impact

If you meet the requirements above and want to do some of the most impactful ML research work in London right now - get in touch for a fast response.

Benefits

equity · medical coverage · dental coverage · vision coverage · uncapped holiday · daily meals