Staff Software Engineer – AI Systems, Runtimes

1001 - 5000 employees

Founded 2008

🏢 Enterprise

☁️ SaaS

🤖 Artificial Intelligence

💰 $4.1M Venture Round on 2013-01

Cloudera is a leading enterprise data cloud company that helps businesses manage and analyze data across any environment. Its hybrid data platform supports modern data architectures, including open data lakehouse, scalable data mesh, and unified data fabric, designed for artificial intelligence, data engineering, and machine learning. Key industries served include financial services, telecommunications, and healthcare, where Cloudera's platform enables secure, scalable data management. By applying AI and advanced analytics at scale, Cloudera helps organizations turn their data into actionable insights.

🕒 June 2

🏢🏡 San Jose – Hybrid

⏰ Full Time

🔴 Lead

🧑‍💻 Full-stack Engineer

🦅 H1B Visa Sponsor

Docker

JavaScript

Kubernetes

Node.js

Python

Rust

📋 Description

• Design and implement elegant, scalable application services (Go/Node.js) that wrap AI capabilities for enterprise use.
• Lead the deployment of inference servers (vLLM, Triton) using KServe, KubeRay, or Knative to ensure serverless-style scaling for AI workloads.
• Build internal tooling, SDKs, and "AI Gateways" that enhance team agility and simplify the integration of Foundation Models (Llama, GPT) into product features.
• Architect robust Retrieval-Augmented Generation (RAG) pipelines and prompt management services that integrate seamlessly with vector databases and enterprise data sources.
• Partner with UI engineers, UX designers, and Product Management to ensure the AI platform is not just powerful, but highly usable for internal developers.
• Ensure AI workloads are secure, multi-tenant, and optimized for GPU resource scheduling (MIG, fractional GPUs) within Kubernetes.

🎯 Requirements

• Bachelor’s degree with 6+ years of software engineering experience (or equivalent Master’s/PhD tenure), including at least 2 years focused on AI/ML systems.
• Expert proficiency in Python (for the AI ecosystem) and strong competence in a systems language such as Go, Rust, or C++ (for high-performance serving layers).
• Deep understanding of LLM deployment challenges and runtimes (e.g., vLLM, ONNX, TorchServe, Triton).
• Familiarity with quantization techniques (AWQ, GPTQ) to optimize model size and speed.
• Experience building complex workflows using tools like LangChain or LlamaIndex, and deploying them on containerized infrastructure (Docker/Kubernetes).
• Ability to navigate the rapidly changing AI landscape, separating hype from practical engineering solutions, and to drive technical alignment across teams.

🏖️ Benefits

• Generous PTO Policy
• Support for work-life balance with Unplugged Days
• Flexible WFH Policy
• Mental & Physical Wellness programs
• Phone and Internet Reimbursement program
• Access to Continued Career Development
• Comprehensive Benefits and Competitive Packages
• Paid Volunteer Time
• Employee Resource Groups
