Senior AI Engineer
DepoDirect
Full-time | Remote (US) | Reports to Lead Engineer
About the Role
---------------
DepoDirect is a deposition management platform used by litigation teams across the country to schedule, record, transcribe, and deliver depositions. We are in the middle of a strategic transformation from an operations platform into a deposition intelligence platform, and we need an AI engineer who can build the products that will define the next chapter of this company — while having the engineering depth to strengthen the distributed platform underneath them.
The immediate, highest-priority objective is clear: our distribution partner processes 300,000 depositions per year and is evaluating DepoDirect as their AI provider. Within the first 90 days, you will be building and shipping the AI-powered transcript intelligence features that plug directly into that pipeline. This is not a research role or an R&D sandbox. You will be writing production code that touches real depositions from day one.
You will be working directly alongside the founder and Lead Engineer. The engineering culture here is AI-forward: we use Claude Code as a primary development tool, we build with LLMs as first-class components, and we expect you to be fluent in that workflow. But we also need someone with real engineering depth, because when the AI tooling hits a wall, you need to know how to debug a Rails service, trace a GCP Cloud Function, or untangle a Pub/Sub pipeline without losing a day.
What You Will Be Building
---------------------------
The work spans the full lifecycle of a deposition—before, during, and after—and touches every layer of the stack. Here is how to think about the scope:
AI-powered APIs for high-volume transcript processing. We have a distribution partner that processes hundreds of thousands of depositions per year. Your first priority is building and hardening the production AI services that plug into that volume. Think document ingestion, LLM-driven analysis, and structured output delivery at scale.
Transcript intelligence and quality systems. We have existing pipelines that combine deterministic processing, LLM-powered contextual review, and audio analysis. You will extend and improve these systems, pushing accuracy higher and expanding what they can catch.
Retrieval and search over legal documents. Building the infrastructure that lets attorneys query across large bodies of testimony and case materials using natural language. This includes document parsing, embedding pipelines, vector storage, retrieval optimization, and evaluation frameworks to measure quality over time.
Attorney-facing AI products. We are building tools that help trial attorneys prepare for, conduct, and analyze depositions using AI. These products span preparation workflows, real-time assistance, and post-deposition review. You will help take them from early prototype to production-grade.
Technical Environment
------------------------
DepoDirect is a distributed microservices platform with a hub-and-spoke architecture. Syndicate is the central data hub, with event-driven communication via Google Cloud Pub/Sub. The codebase spans multiple repositories and services — you should be comfortable navigating a system of this breadth, not intimidated by it.
Platform: Ruby on Rails 7.1 APIs (API-only mode), TypeScript backends and Cloud Functions, Angular 19 frontends (Nx monorepo), Google Cloud Platform (Cloud Run, Cloud Functions, Pub/Sub, Cloud SQL, Cloud Storage), Terraform, CircleCI, Auth0
AI/ML Stack: Anthropic Claude, Google Gemini, Deepgram Nova-3, AssemblyAI, custom prompt engineering pipelines, document parsing (python-docx, pdfplumber), vector databases for RAG retrieval
AI Tooling: Python/Streamlit for internal AI tools, Supabase + Lovable for rapid prototyping, Claude Code for daily development workflow
Data Layer: PostgreSQL 14 (Cloud SQL) with UUID primary keys, Firestore for real-time document data, Cloud Storage for media/documents, JSON:API spec across all APIs
Backend Services: Rails 7.1 powers the core domain services — Syndicate (central data hub), Scribe (transcript processing), Zoom (video integration), Email, and Reporter portal. Business logic follows the Interactor pattern, serialization follows the JSON:API spec, and each monorepo pairs a Rails API with TypeScript Cloud Functions for event-driven processing (Pub/Sub listeners, webhook receivers).
Frontend: Angular 19 in an Nx monorepo with shared component libraries. Standalone components, @ngrx/signals for state management, Angular Material + PrimeNG for UI, Auth0 for auth, LaunchDarkly for feature flags, PostHog for analytics.
Google Cloud Platform: Cloud Run for backend services. Cloud Functions for event handlers and webhooks. Cloud SQL (PostgreSQL 14) and Firestore for data. Pub/Sub for async inter-service messaging. Secret Manager for secrets. IAM with OIDC for service-to-service auth. Cloud Armor for edge security.
Infrastructure as Code: Terraform with Terraform Cloud — every repo manages its own infrastructure via co-located infra/ directories, plus shared repos for databases, networking, DNS, and org-level resources. CircleCI with a custom base orb handles CI/CD. All services containerized with Docker and deployed to Cloud Run via Google Artifact Registry.
Observability: OpenTelemetry SDK with sidecar OTel Collector exporting to Cloud Trace and Google Managed Prometheus. Structured logging via Pino (TypeScript) and Rails Logger piped to Cloud Logging. Sentry for error tracking. PostHog for product analytics.
Auth: Auth0 for user authentication (JWT with RBAC claims), GCP OIDC for IAM-authenticated inter-service calls, API keys for select integrations.
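To give a flavor of the Interactor pattern mentioned above, here is a minimal self-contained sketch in plain Ruby. All class and method names are illustrative, not taken from the DepoDirect codebase, and the real services would follow the `interactor` gem's conventions rather than hand-rolling a context object:

```ruby
# A context object carries inputs and outputs through the interactor,
# and records whether the operation failed.
class Context
  def initialize(data = {})
    @data = data
    @failure = false
  end

  def [](key)
    @data[key]
  end

  def []=(key, value)
    @data[key] = value
  end

  def fail!(error:)
    @data[:error] = error
    @failure = true
  end

  def success?
    !@failure
  end
end

# Each interactor encapsulates one unit of business logic behind a
# single `call` entry point. NormalizeTranscript is a hypothetical
# example, not a real DepoDirect service.
class NormalizeTranscript
  def self.call(data = {})
    context = Context.new(data)
    new(context).call
    context
  end

  def initialize(context)
    @context = context
  end

  def call
    raw = @context[:raw]
    return @context.fail!(error: "missing transcript") if raw.nil? || raw.empty?

    # Collapse all runs of whitespace, as a stand-in for real normalization.
    @context[:normalized] = raw.split.join(" ")
  end
end

result = NormalizeTranscript.call(raw: "  Q.   Please state\nyour name. ")
puts result[:normalized] # => "Q. Please state your name."
```

The payoff of the pattern is that every business operation has one testable entry point and one explicit success/failure channel, which keeps Rails controllers thin.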
What We Are Looking For
--------------------------
This is a senior-to-staff level role. We care about how you think, what you can ship, and whether you can own systems end-to-end in a small team. Here is the profile we are hiring for:
- You are extremely organized. You can juggle multiple workstreams across different repositories and services without dropping threads. You keep your work visible — tickets updated, PRs scoped clearly, context documented — so that the rest of the team always knows where things stand. In a three-person engineering team, nobody can afford to be the person whose work is a black box.
- You are low-ego and collaborative. In a small team, there is no room for territorial behavior or politics. You give and receive feedback openly, you pick up what needs to be done even when it is not glamorous, and you care more about the team shipping the right thing than about who gets credit. When someone has a better idea, you run with it.
- You have built and deployed AI-powered features in production, not just prototypes. You understand prompt engineering, chunking strategies, context window management, and the operational realities of LLM-based systems (latency, cost, reliability).
- You are fluent with AI-assisted coding tools, particularly Claude Code. You use AI to move faster, but you understand what the AI is generating well enough to debug it, refactor it, and make architectural decisions the AI cannot.
- You have designed, built, and operated distributed systems in production. You understand service-to-service communication patterns (sync HTTP, async messaging), eventual consistency, and the failure modes that come with distributed architectures. You have meaningful experience with Google Cloud Platform — or equivalent cloud infrastructure. You can write and reason about Terraform, read a Terraform plan, and make infrastructure changes alongside application code.
- You are comfortable with Python for data/ML work and TypeScript or Ruby for application code. We are not looking for someone who only knows one language.
- You can work with LLM APIs (Anthropic, Google, OpenAI) and speech-to-text APIs (Deepgram, AssemblyAI) to build real products, not toy demos.
- You are self-directed and can take a product requirement, break it into engineering tasks, and ship it without someone managing your day-to-day. This is a three-person engineering team. There is no tech lead buffer between you and production.
- You communicate clearly about trade-offs, blockers, and progress. When you hit a wall, you say so early rather than spinning. You can make architectural decisions and articulate the trade-offs clearly. You can review code, provide constructive feedback, and raise the quality bar for the team.
- You have hands-on experience building retrieval-augmented generation systems — document parsing and chunking, embedding generation, vector store selection and tuning, hybrid retrieval strategies, re-ranking, and evaluation frameworks for measuring retrieval quality and answer accuracy. You know how to iterate on a RAG pipeline using real-world evaluation data rather than vibes.
- You understand observability: structured logging, distributed tracing (OpenTelemetry), error tracking, and how to use these tools to diagnose production issues across service boundaries. You can reason about database performance — query optimization, indexing strategies, and the trade-offs between relational (PostgreSQL) and document (Firestore) data stores.
- You write tests — not as an afterthought, but as part of how you work. RSpec for Rails, Jest for TypeScript, request specs for API integration testing. You are comfortable with CI/CD pipelines, Docker, and the deployment lifecycle from local development through staging to production.
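As a concrete illustration of what "evaluation data rather than vibes" means for retrieval, here is a toy recall@k harness in Ruby. The retriever, passages, and eval set are all stand-ins invented for this sketch; a real harness would call the production embedding and vector-search stack:

```ruby
# Each eval case pairs a query with the IDs of the passages a
# reviewer judged relevant. This labeled set is the ground truth.
EVAL_SET = [
  { query: "objections to form", relevant: ["p1", "p4"] },
  { query: "witness address",    relevant: ["p2"] }
].freeze

# Toy corpus and retriever: ranks passages by word overlap with the
# query. A real system would use embeddings + a vector store here.
PASSAGES = {
  "p1" => "counsel raised objections to the form of the question",
  "p2" => "the witness stated her home address for the record",
  "p3" => "the deposition recessed for lunch at noon",
  "p4" => "objection form foundation counsel instructed not to answer"
}.freeze

def retrieve(query, k:)
  words = query.downcase.split
  PASSAGES.max_by(k) { |_id, text| (text.split & words).size }.map(&:first)
end

# recall@k: averaged over the eval set, what fraction of the known
# relevant passages show up in the top k results?
def recall_at_k(eval_set, k:)
  scores = eval_set.map do |c|
    hits = retrieve(c[:query], k: k) & c[:relevant]
    hits.size.to_f / c[:relevant].size
  end
  scores.sum / scores.size
end

puts recall_at_k(EVAL_SET, k: 2) # => 1.0
```

Tracking a metric like this against a fixed eval set is what lets you tell whether a chunking or re-ranking change actually improved retrieval.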