Member of Technical Staff - Infrastructure & LLMs
About this role
We’re looking for a Member of Technical Staff - Infrastructure & LLMs with deep curiosity and strong technical instincts to join us at the earliest stages of building an AI-native inference platform. We’re rethinking how modern analytical workloads (structured extraction, classification, sentiment analysis, multimodal question answering) run at scale, designing systems that spin up hundreds of GPUs to process data efficiently, securely, and cost-effectively.
This role is ideal for someone who thrives on hard technical problems and wants to work at the intersection of infrastructure and large language models. You’ll join a two-person full-time team, plus a few fractional contributors, working closely with the founder to build and own foundational systems from the ground up. You’ll have real ownership over performance-critical code and systems, from distributed job schedulers to secure multi-tenant inference infrastructure. Example projects might include scaling fault-tolerant inference workloads to trillions of tokens, building ultra-fast data caching and lookup pipelines, writing performance profilers and cost attribution models, or experimenting with novel LLM distillation techniques.
We’re not looking for a particular pedigree; we care most about your ability to learn fast, build well, and go deep. This is a rare opportunity to shape a foundational AI platform early, contribute meaningfully to the technical roadmap, and grow into a key leader or co-founder. The team is based in San Francisco and prefers in-person collaboration 2–4 days per week.
Ideal Candidate Profile
- 2+ years of experience as a software engineer across the full stack, with particular depth in infrastructure
- Deep technical curiosity and strong infrastructure skills
- An exceptional builder, able to learn fast and ship well-crafted products
- Open-source contributions, side projects, or personal AI work
- Self-motivated interest in LLMs, inference infra, or backend tooling
- Background at infra-heavy companies (e.g., Supabase, Dagster Labs, Hex, Prefect, MotherDuck) or AI infra startups (e.g., Anyscale, Modal, Lightning AI)
Tech Stack / Skills
Core Requirements:
- Python – SDK & backend development
- Distributed Systems – batch inference across 100+ GPUs
- CUDA / GPU orchestration – large-scale LLM workloads
- Docker / Containerization
- Kubernetes or equivalent
Bonus / Nice-to-Have:
- Security
- Performance profiling & cost attribution
- Open-source contributions
Salary - $170k–$220k
Equity - High target equity, 1–3%
Visa sponsorship not available
Hybrid work policy
Full-time position
About company
We build large-scale AI inference infrastructure to increase human leverage, grow productivity, and enable discovery.
Team size
2 people
Founded
2023
Company locations
San Francisco, CA
About the team
Based in the SF Bay Area; typically 2–4 days in person. Expect ~8–12 hours/day of work with some nights/weekends, but no face-time culture: “Use all your gas, but don’t burn out.” The team values creative, thoughtful work over brute-force hours, and is open to generous equity and founder-level responsibility over time.
Tech stack
- Python – Core programming language for the product (SDK and backend)
- Distributed Systems – Custom infra for batch inference across 100+ GPUs
- CUDA / GPU orchestration – Running large-scale LLM inference workloads efficiently
- Docker / Containerization – Used for infra deployment and scaling
- Kubernetes or equivalent – Relevant for managing distributed workloads at scale
Interview process
1. Initial screen
2. Take-home project
3. Review session
4. Cultural/values alignment