Senior Engineer 2 – Inference Data Plane

🕒 March 17

☕ Washington – Remote

💵 $167.2k - $209k / year

⏰ Full Time

🟠 Senior

🧑‍💻 Full-stack Engineer

🦅 H1B Visa Sponsor

DigitalOcean

1001 - 5000 employees

Founded 2011

☁️ SaaS

SaaS • Cloud Computing

DigitalOcean is a cloud infrastructure provider that offers a suite of products and services for developers to build, deploy, and scale applications. Its platform provides tutorials, reference material, and support documentation to help users manage resources through its API and CLI tools. With features like Droplets (virtual machines), managed databases, Kubernetes, and a marketplace for third-party applications, DigitalOcean focuses on simplicity and performance. It caters to both individual developers and larger organizations looking for cloud solutions that are easy to implement and manage.

📋 Description

• Act as a technical leader on the team, driving the end-to-end design, development, and delivery of critical data plane components hosting large generative AI models.
• Architect and refine system design proposals for our high-scale, multi-tenant AI inference cloud ecosystem, ensuring they meet rigorous availability and resiliency standards.
• Implement and optimize distributed inference hosting using techniques like tensor/data parallelism, KV cache optimizations, and smart routing.
• Work cross-functionally with Product Managers, customer-facing teams, and other engineering teams to align technical roadmaps with customer needs.
• Coach and mentor junior engineers, fostering a culture of technical excellence and continuous improvement.
• Maintain and operate critical, high-scale services, utilizing observability tools and defining SLOs to ensure superior platform health.

🎯 Requirements

• Strong experience with microservices, messaging systems, databases, and infrastructure as code.
• Hands-on experience hosting large language or multimodal models using inference engines like vLLM, SGLang, or Modular.
• Familiarity with distributed inference serving frameworks such as llm-d, NVIDIA Dynamo, or Ray Serve.
• Understanding of GPU-level optimization and experience with interconnect technologies like NVLink, XGMI, or RoCE.
• Knowledge of common LLM architectures and optimization techniques (e.g., continuous batching, quantization).
• Expert-level proficiency in Go or Python and familiarity with gRPC.
• Proven experience shipping customer-facing software products and running critical services in a high-scale environment similar to DigitalOcean.
• Experience integrating and building with open-source software.

🏖️ Benefits

• Employee Assistance Program
• Local Employee Meetups
• Flexible time off policy
• Reimbursement for relevant conferences, training, and education
• Access to LinkedIn Learning's 10,000+ courses
