Founding Engineer - Robotics Data Infrastructure

Neural Motion | California, United States | Yesterday
$150,000 – $220,000/yr | full-time | hybrid | lead
skills: python, go, distributed systems, data pipelines, streaming pipelines, microservices, kafka, pulsar, temporal, airflow, aws, gcp, s3, sqs, lambda, ecs, eks, robotics, ros, urdf, kinematics, fk/ik, coordinate frames, calibration, isaac gym, mujoco, robot learning, embodied ai, manipulation, vla, world model, imitation learning, rl, dataset design, multimodal datasets, simulation, apis, sdks, data loaders, mlops, data versioning, reproducibility

Neural Motion is an early-stage robotics startup building infrastructure for robotics datasets and cross-embodiment robot learning.

Today, robotics data is fragmented across embodiments, formats, and pipelines. This prevents models from learning shared priors and blocks the scaling we’ve seen in language and vision. Our goal is to fix this by building a universal data pipeline and cross-embodiment representation layer that unifies real-world logs, simulation, and multimodal datasets into a single, composable system.

This platform will power:

  • Public datasets and tooling for researchers
  • Data pipelines and sourcing infrastructure for enterprise robotics and AI labs
  • Cross-embodiment learning from large, real-world datasets

We are looking for a Founding Engineer to own and drive core parts of this system, from large-scale data pipelines to embodiment-aware transformations. As one of the earliest engineers, you will be at the forefront of our mission: letting knowledge learned on one robot be reused across many, a shift that could reshape physical AI.

What You’ll Work On

You will operate at the intersection of data systems, cloud infrastructure, and robotics learning.

Core Areas
  • Design and build high-throughput data pipelines for ingesting, processing, and standardizing robotics datasets
  • Architect distributed systems and microservices for robotics data processing and dataset infrastructure
  • Develop the data compiler layer that standardizes raw logs into a unified representation
  • Build cross-embodiment transformation pipelines (retargeting, normalization, alignment)
  • Integrate multimodal augmentation models (vision, language, SLAM, simulation)
  • Enable real ↔ sim pipelines and unified evaluation frameworks
  • Build tooling for dataset ingestion and validation, annotation and enrichment, and dataset versioning and reproducibility
Product Surfaces
  • Public dataset platform (APIs, SDKs, data loaders)
  • Internal pipelines for enterprise data sourcing and validation
  • Interfaces for model training and evaluation
Technical Ownership
  • Help define the architecture for robotics dataset infrastructure and pipelines
  • Work directly with the founding team on product and technical direction
Research Collaboration

Neural Motion is actively exploring research directions in cross-embodiment robot learning and dataset representations. You will collaborate with robotics researchers working on these problems and help translate research results into practical infrastructure and tooling.

  • Integrate research outputs from robotics learning experiments into platform infrastructure
  • Support experiments around cross-embodiment datasets
Who You Are

We are open to two strong profiles, ideally combined:

(A) Infrastructure / Distributed Systems Engineer
  • Experience building large-scale data systems (TB–PB scale)
  • Strong background in:
      • distributed systems
      • streaming pipelines
      • microservices architecture
  • Hands-on with tools such as:
      • Kafka / Pulsar
      • Temporal / Airflow / workflow orchestration
      • AWS (S3, SQS, Lambda, ECS/EKS) / GCP equivalents
  • Experience designing robust, fault-tolerant pipelines
  • Strong backend engineering skills (Python, Go, or similar)
(B) Robotics / Robot Learning Engineer
  • Experience in robot learning / embodied AI / manipulation
  • Familiarity with:
      • VLA / world models
      • imitation learning / RL
      • dataset design for robotics
  • Strong understanding of:
      • kinematics (FK/IK)
      • retargeting across embodiments
      • coordinate frames and calibration
  • Experience working with:
      • ROS / URDF
      • simulation tools (Isaac Gym, MuJoCo, etc.)
  • Good intuition for what makes high-quality robotics data
Ideal Candidate
  • Driven by the mission to define new, foundational infrastructure for an entire field
  • Has experience in both infrastructure and robotics, or has worked closely across both
  • Thinks in systems: not just models or pipelines, but how everything composes
  • Cares deeply about data quality, structure, and scalability
  • Comfortable working in an ambiguous, fast-moving environment
Bonus Points
  • Experience with large multimodal datasets (video, sensor logs, etc.)
  • Experience with dataset platforms (HuggingFace, TFDS, RLDS, etc.)
  • Experience building internal tools for ML/data teams
  • Exposure to simulation ↔ real transfer systems
  • Startup or zero-to-one experience
Location / Setup

San Francisco/Remote

Compensation / Equity

Compensation for this role includes:

  • $150,000 – $220,000/yr base salary + meaningful early-stage equity

Benefits

equity · health insurance