AI Engineer (Agentic Systems)

SureBright | Gurugram, Haryana, India+1 | 2w ago
locations: Gurugram, Haryana, India · Delhi, India
₹1,500,000 – ₹3,000,000/yr | full-time | on-site | senior | 1+ years
skills: llm, rag, vector databases, prompt engineering, fine-tuning, python, typescript, postgres, aws, azure, mlops, observability, ci/cd, tool calling, state machines, memory management, audit trails, human-in-the-loop, document understanding, extraction, classification, anomaly detection, fraud detection, decision support, sla routing, api design, testing, performance tuning, secure design, model optimization, latency optimization, cost optimization, caching, throughput tuning, embeddings, retrieval, reranking, hallucination mitigation, regression testing, drift detection, red teaming, structured outputs

This is a high-ownership “do whatever it takes” role for someone who wants to operate at founder speed, learn the full stack of an insurance/warranty business, and ship work that directly moves revenue, conversion, and retention.

What you’ll do
You will build the agentic layer of our core product: AI systems that reason, take actions, and reliably complete workflows across pricing/underwriting, policy issuance, claims intake, adjudication, fulfillment (repair/replacement/reimbursement), and other parts of the business.


Key responsibilities

  • Design and ship production-grade AI agents that run real business processes (not demos)
  • Build agentic architectures: orchestration, tool calling, state machines, memory, permissions, audit trails, human-in-the-loop, and fallback paths
  • Own our RAG platform end-to-end: ingestion, chunking, embeddings, retrieval, reranking, citations/grounding, and hallucination mitigation
  • Build evaluation and monitoring systems: offline eval sets, regression tests, online metrics, drift detection, and red-team suites
  • Implement model optimization: prompt systems, structured outputs, fine-tuning where appropriate, latency/cost optimization, caching, and throughput tuning
  • Build core ML systems for warranty/claims: document understanding, extraction, classification, anomaly/fraud signals, decision support, and SLA routing
  • Partner tightly with product/ops to translate real workflows into deterministic, testable, compliant automation


What you’ll build (examples)

  • Underwriting/pricing agents: real-time quote decisions using merchant/product/context signals with strict guardrails and auditability
  • Claims copilot + auto-adjudication engine: intake triage, evidence requests, decision proposals with explanation, vendor routing, reimbursement automation
  • OEM warranty parsing system: turn messy manufacturer policies into machine-readable coverage logic
  • Internal ops copilots: tooling that reduces manual work and increases consistency across customer support, compliance, and finance

Requirements (must have)
(We are hiring at several levels for this role; the required years of experience and expected skill level vary by level.)

  • 1+ years building and shipping ML/LLM systems in production (or equivalent founder-level experience)
  • Proven experience building agentic products/companies: multi-step workflows, tool use, orchestration, reliability engineering
  • Deep hands-on expertise in:
    • RAG and retrieval systems (vector databases, reranking, grounding strategies)
    • LLM evals (golden sets, automated judging, human eval, regression pipelines)
    • Prompting and structured outputs (schemas, function/tool calling, robustness)
    • Model training/fine-tuning fundamentals and tradeoffs (when to tune vs prompt vs retrieve)
  • Strong software engineering: clean APIs, testing, observability, performance tuning, secure-by-default design
  • Comfortable owning ambiguous problems end-to-end and driving them to measurable outcomes


Strong preference (nice to have)

  • Experience building systems with compliance/audit requirements (fintech/insurance/health/enterprise)
  • Experience with document AI at scale (PDFs, images, messy inputs), and extracting structured truth reliably
  • Experience designing human-in-the-loop workflows and escalation rules for high-stakes decisions
  • Experience with infra for LLMs: model hosting, batching, streaming, caching, prompt/version management
  • Startup or ex-founder background, especially shipping 0→1 products fast


What success looks like (first 90 days)

  • You ship an agentic workflow that replaces meaningful manual ops work and improves a measurable metric (cycle time, accuracy, cost per claim, attach rate, CSAT)
  • You implement an eval harness that catches regressions before production and gives us a reliable “quality score” per workflow
  • You establish a scalable architecture pattern for agents (permissions, audit logs, observability, fallbacks) that the team can replicate


Tech environment
We’re cloud-native and move fast. Expect Python for ML/agents, TypeScript for product surfaces, Postgres for systems of record, event-driven services, and a modern LLM + retrieval stack with strong observability and CI/CD, running on AWS and Azure.

Why this role is special

  • Build an AI-native category-defining company in a massive market
  • Direct founder exposure and high leverage: your work will change the trajectory of the company
  • Real breadth: growth + underwriting/claims ops + product, in one seat
  • Career accelerant: if you perform, your scope and title will grow quickly

How to Apply

  • Please ensure your profile is up to date and includes a link to your LinkedIn.
  • In your application message, share 3 things you’ve built or delivered and the results you achieved, in one simple sentence per example (3 sentences total).

Benefits

health insurance