Staff ML Research Engineer | LLM Fine-Tuning | RL | SFT | DPO

Optimal | London, United Kingdom | 1mo ago

£210,000 – £305,000/yr | full-time | on-site | senior | 3–5 years

skills: llm fine-tuning, reinforcement learning, sft, dpo, ppo, grpo, instruction tuning, information retrieval, agent capability, software engineering, ml engineering, voice model, speech model, multimodal model, synthetic data generation, evaluation pipelines


Staff ML Research Engineer | LLM Fine-Tuning | RL | SFT | DPO | Must Have Startup Experience

Salary: £210,000 - £305,000 + meaningful equity

Contract: Permanent

Start: ASAP

Working model: Full-time onsite (some flex around this) - Shoreditch, London

Eligibility: Must have existing UK work authorisation - Dependent visa is fine

What You'll Get

£210,000 – £305,000 + meaningful equity
Full medical, dental, and vision coverage
Uncapped holiday - take what you need, when you need it
Daily meals covered - breakfast, lunch, dinner, and snacks on site every day

Optimal has teamed up with an exciting AI company based in London that’s growing its research team, offering a rare opportunity to join at just the right time.

This is a company that has built its own proprietary models, has serious commercial traction, and is scaling fast. They're not wrapping third-party APIs and calling it AI. They've done the hard work - and it's paying off.

The research function sits at the core of everything. You'll be building AI systems that push the boundary of what's possible - designing and implementing state-of-the-art methods for instruction tuning, information retrieval, and agent capability. Your work won't sit in a repo waiting for review. It'll ship, it'll scale, and it'll be used by real customers.

This is a small, elite team. Everyone here has high ownership, high impact, and zero time for passengers. If that sounds like your environment - keep reading.

🚨 Please only apply if you have ALL of the following 🚨

5+ years in applied AI/ML engineering or research (exceptional candidates at 3+ considered)
Hands-on, production-grade experience fine-tuning LLMs (SFT, DPO, PPO, GRPO, RL)
Proven track record deploying models in live, customer-facing environments
Experience working with large open-source LLMs (e.g. Llama or similar)
Startup or scale-up experience - you've moved fast, owned outcomes, shipped real things
Strong software engineering fundamentals - this is not a data science or research-only role

ℹ️ Very Important Notes

This role requires deep applied ML engineering - not suitable for pure researchers or academic profiles
You must be comfortable in a fast-moving, high-ownership startup environment
You'll be expected to move from research idea → production system at pace
Onsite in Shoreditch, London - Full-time (staff-level candidates have some flexibility)

Must-Haves

Production-grade LLM fine-tuning experience - SFT, RL, DPO, PPO, GRPO
Deep familiarity with large open-source language models
Strong software engineering skills - you write clean, bug-free ML code
Ability to break down ambiguous research problems into clear, shippable milestones
Startup mindset - evidence of high personal achievement and genuine ownership
Excited about applied product impact, not just foundation model or academic research

Bonus Experience

Voice and speech model experience
Multimodal model exposure
Experience generating synthetic data and building evaluation pipelines
Background from product-driven orgs (not just research labs)
Prior founding engineer or early-stage startup experience

What You'll Be Doing

Research & Model Development

Build and fine-tune models for complex, real-world customer-facing tasks
Run experiments with open-source LLMs to drive order-of-magnitude improvements in latency and performance
Design and implement state-of-the-art instruction tuning and information retrieval methods

Production Ownership

Take research from prototype to fully deployed, production-grade system
Validate model behaviour against real-world workflows and user feedback
Improve reliability, capability, and performance of live AI systems

Collaboration

Work directly alongside science and engineering teams on new architectures
Feed real-world findings back into platform evolution and roadmap
Break down research ideas into clear, iterative milestones

What They're Looking For

A technically sharp, applied engineer with a founder-like mentality
Someone who thrives in ambiguity and moves with urgency
A builder who wants to ship things that matter, not just publish papers
An engineer who's energised by ownership, pace, and real-world impact

If you meet the requirements above and want to do some of the most impactful ML research work in London right now - get in touch for a fast response.

Benefits

equity · medical coverage · dental coverage · vision coverage · uncapped holiday · daily meals
