Senior Software Engineer – Container and Cloud Infrastructure

September 19

NVIDIA

Artificial Intelligence • Gaming • Automotive

NVIDIA is a leading technology company specializing in accelerated computing and artificial intelligence. NVIDIA pioneers advancements in graphics processing units (GPUs), cloud computing, data centers, and virtual reality, with a focus on the gaming, automotive, healthcare, and robotics industries. The company's innovations, such as NVIDIA Omniverse, transform traditional digital processes by enabling high-fidelity simulation and rendering. Its applications span many industries, from autonomous vehicles using NVIDIA DRIVE to healthcare solutions with NVIDIA Clara and AI-driven analytics and workflows.

10,000+ employees

Founded 1993

🤖 Artificial Intelligence

🎮 Gaming

📋 Description

• Design, build, and harden containers for NIM runtimes and inference backends; enable reproducible, multi-arch, CUDA-optimized builds.
• Develop Python tooling and services for build orchestration, CI/CD integrations, Helm/Operator automation, and test harnesses; enforce quality with typing, linting, and unit/integration tests.
• Help design and evolve Kubernetes deployment patterns for NIMs, including GPU scheduling, autoscaling, and multi-cluster rollouts.
• Optimize container performance: layer layout, startup time, build caching, runtime memory/IO, network, and GPU utilization; instrument with metrics and tracing.
• Evolve the base image strategy, dependency management, and artifact/registry topology.
• Collaborate across research, backend, SRE, and product teams to ensure day-0 availability of new models.
• Mentor teammates; set high engineering standards for container quality, security, and operability.
• Build enterprise-grade software and tooling for container build, packaging, and deployment; improve reliability, performance, and scale across thousands of GPUs.
• Support disaggregated LLM inference and emerging deployment patterns.

🎯 Requirements

• 10+ years building production software with a strong focus on containers and Kubernetes.
• Strong Python skills building production-grade tooling and services.
• Experience with Python SDKs and clients for Kubernetes and cloud services.
• Expert knowledge of Docker/BuildKit, containerd/OCI, image layering, multi-stage builds, and registry workflows.
• Deep experience operating workloads on Kubernetes.
• Strong understanding of LLM inference features, including structured output, KV cache, and LoRA adapters.
• Hands-on experience building and running GPU workloads in k8s, including the NVIDIA device plugin, MIG, CUDA drivers/runtime, and resource isolation.
• Excellent collaboration and communication skills; ability to influence cross-functional design.
• A degree in Computer Science, Computer Engineering, or a related field (BS or MS), or equivalent experience.
• Expertise with Helm chart design systems, Operators, and platform APIs serving many teams (preferred).
• Experience with the OpenAI API and Hugging Face API, as well as an understanding of different inference backends (vLLM, SGLang, TRT-LLM) (preferred).
• Background in benchmarking and optimizing inference container performance and startup latency at scale (preferred).
• Prior experience designing multi-tenant, multi-cluster, or edge/air-gapped container delivery (preferred).
• Contributions to open-source container, k8s, or GPU ecosystems (preferred).

🏖️ Benefits

• Competitive salaries
• Generous benefits package
• Eligible for equity
