Software Engineer – GPU Networking, Distributed Systems

11 - 50 employees

Founded 2020

🤖 Artificial Intelligence

☁️ SaaS

🏢 Enterprise

💰 $8M Seed Round on 2022-04

Baseten provides fast, scalable model inference services designed for performance, security, and a delightful developer experience. It offers tools that streamline the entire development process, enabling high-throughput inference and fast deployment times. Baseten serves enterprise companies with robust, secure, and scalable model serving for machine learning and AI model deployment, so organizations can manage model infrastructure efficiently while focusing on creating domain-specific models. Baseten also supports open-source model packaging and offers autoscaling to handle varying demand.

Software Engineer – GPU Networking, Distributed Systems

🕒 February 24

🏢🏡 San Francisco – Hybrid

💵 $150k - $250k / year

⏰ Full Time

🟡 Mid-level

🟠 Senior

🧑‍💻 Full-stack Engineer

🦅 H1B Visa Sponsor

Python

📋 Description

• Make RDMA First-Class: Integrate RDMA/RoCE/InfiniBand capabilities into our inference stack.
• Optimize Distributed Inference: Implement and tune networking layers for Disaggregated KV Cache Offload and WideEP.
• Enable Serverless-Grade Startup Speeds for LLMs: Work with checkpointing and storage for sub-10-second startup for models.
• Deep-Dive into Hardware: Validate networking performance on bleeding-edge clusters and write acceptance tests.
• Build Observability: Design tools to visualize packet flow and diagnose distributed system behaviors.
• Optimize Kernels: Work with communication libraries (NCCL, NVSHMEM) and write custom kernels to overlap compute and data transfer.

🎯 Requirements

• Deep experience with high-performance networking protocols (InfiniBand, RoCE v2) and an understanding of the physics of data movement.
• Fluency in C++ or Python, with the ability to bridge the gap between high-level logic and hardware.
• Deep understanding of the memory hierarchy in modern NVIDIA architectures (H100/Blackwell) and how to optimize for it.
• Experience with NCCL, NVSHMEM, and UCX is highly preferred.
• Experience with GPUDirect Storage (GDS) or high-performance filesystems like Weka or 3FS.
• Familiarity with TensorRT-LLM, vLLM, or SGLang is a plus.
• Experience running low-level benchmarks to "qualify" new hardware clusters.

🏖️ Benefits

• Competitive compensation, including meaningful equity.
• 100% coverage of medical, dental, and vision insurance for employees and dependents.
• Generous PTO policy, including a company-wide Winter Break (our offices are closed from Christmas Eve to New Year's Day!).
• Paid parental leave.
• Company-facilitated 401(k).
• Exposure to a variety of ML startups, offering unparalleled learning and networking opportunities.

Similar Jobs

Software Engineer, ChatGPT Infrastructure

🕒 February 23

OpenAI

201 - 500

🤖 Artificial Intelligence

☁️ SaaS

🏢 Enterprise

Software Engineer specializing in infrastructure systems for ChatGPT. Develop core abstractions and tooling to support engineering teams in fast iterations.

🏢🏡 San Francisco – Hybrid

💵 $255k - $405k / year

⏰ Full Time

🟡 Mid-level

🟠 Senior

🧑‍💻 Full-stack Engineer

🦅 H1B Visa Sponsor

Distributed Systems

Software Engineer, Dashboard

🕒 February 19

Vercel

201 - 500

☁️ SaaS

🌐 Web 3

Software Engineer responsible for developing the Vercel Dashboard for user interaction and experience optimization. Working across the stack to build personalized, agent-powered surfaces for users.

🏢🏡 San Francisco – Hybrid

💵 $196k - $294k / year

💰 $150M Series D on 2021-11

⏰ Full Time

🟡 Mid-level

🟠 Senior

🧑‍💻 Full-stack Engineer

🦅 H1B Visa Sponsor

JavaScript

Next.js

React

TypeScript

Senior Software Engineer – Streaming Video

🕒 February 19

Aquabyte

11 - 50

🌾 Agriculture

🤖 Artificial Intelligence

Senior Backend Engineer developing systems for real-time video streaming, AI analysis, and industrial machinery control. Collaborating on cloud and edge systems focusing on reliability and security.

🏢🏡 San Francisco – Hybrid

💵 $140k - $170k / year

⏰ Full Time

🟠 Senior

🧑‍💻 Full-stack Engineer

AWS

Cloud

Distributed Systems

Docker

FFmpeg

IoT

Python

Software Engineer – Fullstack

🕒 February 19

OneCrew

1 - 10

🤝 B2B

Software Engineer at OneCrew designing and building AI-driven features for the construction industry. Collaborating with teams to develop reliable systems and improve operations.

🏢🏡 San Francisco – Hybrid

💵 $150k - $210k / year

⏰ Full Time

🟡 Mid-level

🟠 Senior

🧑‍💻 Full-stack Engineer

AWS

Node.js

Postgres

React

FullStack Software Engineer, Codex App

🕒 February 19

OpenAI

201 - 500

🤖 Artificial Intelligence

☁️ SaaS

🏢 Enterprise

FullStack Software Engineer developing systems for Codex desktop application and IDE extension at OpenAI. Building end-to-end features and ensuring usability, performance, and reliability.

🏢🏡 San Francisco – Hybrid

💵 $230k - $385k / year

⏰ Full Time

🟡 Mid-level

🟠 Senior

🧑‍💻 Full-stack Engineer

🦅 H1B Visa Sponsor

Electron

Node.js

Rust

TypeScript