Software Engineer – Voice AI, Inference Runtime

🕒 April 24

Baseten

11 - 50 employees

Founded 2020

🤖 Artificial Intelligence

☁️ SaaS

🏢 Enterprise

💰 $8M Seed Round on 2022-04

Baseten provides fast, scalable model inference services designed for performance, security, and a delightful developer experience. Its tools streamline the development process, enabling high-throughput inference and fast deployment times. Baseten serves enterprise companies with robust, secure, and scalable model serving, particularly for machine learning and AI model deployment. Its solutions let organizations manage model infrastructure efficiently while focusing on building domain-specific models. Baseten also supports open-source model packaging and offers autoscaling to handle varying demand.

📋 Description

• Own and lead Voice AI product areas end-to-end, from architecture and system design through implementation, rollout, and long-term production operations.
• Design, build, and operate real-time, large-scale, high-performance model serving systems for STT, TTS, and voice agent workloads, with clear SLOs for mission-critical customer deployments.
• Drive cross-team collaboration with sister engineering teams to solve full-stack technical problems, align on priorities, and coordinate end-to-end delivery across the product surface area.
• Mentor teammates through code reviews, design docs, and technical leadership.

🎯 Requirements

• Bachelor's degree or higher in Computer Science or a related field.
• Proven track record owning production-grade, real-time, large-scale systems where tail latency (p99) matters.
• Proficiency in one or more popular programming or scripting languages; Python is a plus.
• Good taste in product, particularly developer-oriented tools.
• Interest in ML/AI infrastructure and willingness to learn.
• Strong collaboration and communication skills.
• Comfortable using AI coding assistants (e.g., Claude Code, Codex, Cursor) as a daily productivity multiplier. As an AI-native company, we see this as a must-have skill.
• Experience implementing pipeline-level model runtime optimizations such as dynamic batching, async scheduling, or decode-side throughput improvements. (Nice to have)
• Experience building developer platforms: SDKs, CLIs, APIs, and self-serve workflows for ML or infrastructure products. (Nice to have)
• Experience with containerization and orchestration technologies (Docker, Kubernetes), service meshes, or distributed scheduling. (Nice to have)
• Familiarity with speech/audio ML models (STT, TTS, speech-to-speech). (Nice to have)
• Familiarity with model-serving runtimes (vLLM, TensorRT, ONNX). (Nice to have)
• Familiarity with systems-level performance profiling across host-device boundaries (e.g., PyTorch Profiler) and with diagnosing GPU utilization issues. (Nice to have)
• Exposure to customer-facing engineering: pre-sales prototyping, technical discovery, or working directly with customers to ship solutions. (Nice to have)

🏖️ Benefits

• Competitive compensation, including meaningful equity.
• 100% coverage of medical, dental, and vision insurance for employees and dependents.
• Flexible PTO policy, including a company-wide Winter Break (our offices are closed from Christmas Eve to New Year's Day!).
• Paid parental leave.
• Fertility and family-building stipend through Carrot.
• Company-facilitated 401(k).
• Exposure to a variety of ML startups, offering unparalleled learning and networking opportunities.

Similar Jobs

🕒 April 23

JiffyShirts.com

11 - 50

🛍️ eCommerce

🛒 Retail

👗 Fashion

Software Development Engineer II orchestrating AI coding agents to deliver features on Jiffy’s e-commerce platform. Lead the charge in automating production software using cutting-edge technologies.

🏢🏡 San Francisco – Hybrid

💵 $170k - $220k / year

⏰ Full Time

🟡 Mid-level

🟠 Senior

🧑‍💻 Full-stack Engineer

Apollo

Cloud

GraphQL

Java

Kubernetes

MySQL

Postgres

Python

React

RSpec

Ruby

Ruby on Rails

SQL

TypeScript

Go

🕒 April 23

Notion

501 - 1000

☁️ SaaS

⚡ Productivity

🤖 Artificial Intelligence

Software Engineer building innovative database features for Notion, collaborating across teams in San Francisco and New York City.

TypeScript

🕒 April 23

Adobe

10,000+ employees

Software Development Engineer developing search and discovery features for Adobe Stock. Leading and mentoring engineering teams in complex web applications at Adobe.

GraphQL

JavaScript

Node.js

React

TypeScript

🕒 April 23

Mach Industries

11 - 50

Software Engineer developing software for autonomous defense platforms at Mach Industries. Focus on designing and optimizing mission-critical applications in a hybrid work setting.

🏢🏡 San Francisco – Hybrid

💵 $125k - $220k / year

⏰ Full Time

🟡 Mid-level

🟠 Senior

🧑‍💻 Full-stack Engineer

Python

Rust

🕒 April 22

Benchling

501 - 1000

☁️ SaaS

🧬 Biotechnology

🤝 B2B

Software Engineer embedding cutting-edge scientific AI models into Benchling to help scientists design better molecules and to build a scalable platform for scientific models.

🏢🏡 San Francisco – Hybrid

💵 $173.1k - $234.4k / year

💰 $100M Series F - Benchling on 2021-11

⏰ Full Time

🟡 Mid-level

🟠 Senior

🧑‍💻 Full-stack Engineer

🦅 H1B Visa Sponsor
