Staff Software Engineer – AI Systems, Runtimes

🕒 April 2

🏢🏡 Austin – Hybrid

⏰ Full Time

🔴 Lead

🧑‍💻 Full-stack Engineer

🦅 H1B Visa Sponsor


Cloudera

1001 - 5000 employees

Founded 2008

🏢 Enterprise

☁️ SaaS

🤖 Artificial Intelligence

💰 $4.1M Venture Round on 2013-01

Cloudera is a leading enterprise data cloud company that empowers businesses to manage and analyze data across any environment. Offering a hybrid data platform, Cloudera facilitates modern data architectures with solutions like open data lakehouse, scalable data mesh, and unified data fabric, designed for artificial intelligence, data engineering, and machine learning. Key industries served include financial services, telecommunications, healthcare, and more, where Cloudera's platform enables secure, scalable, and effective data management. By leveraging AI and advanced analytics at scale, Cloudera helps organizations transform their data into actionable insights.

📋 Description

• Design and implement elegant, scalable application services (Go/Node.js) that wrap AI capabilities for enterprise use.
• Lead the deployment of inference servers (vLLM, Triton) using KServe, KubeRay, or Knative to ensure serverless-style scaling for AI workloads.
• Build internal tooling, SDKs, and "AI Gateways" that enhance team agility and simplify the integration of Foundation Models (Llama, GPT) into product features.
• Architect robust Retrieval-Augmented Generation (RAG) pipelines and prompt management services that integrate seamlessly with vector databases and enterprise data sources.
• Partner with UI engineers, UX designers, and Product Management to ensure the AI platform is not just powerful, but highly usable for internal developers.
• Ensure AI workloads are secure, multi-tenant, and optimized for GPU resource scheduling (MIG, fractional GPUs) within Kubernetes.

🎯 Requirements

• Bachelor’s degree with 6+ years of software engineering experience (or equivalent Master’s/PhD tenure), including at least 2 years focused on AI/ML systems.
• Expert proficiency in Python (for the AI ecosystem) and strong competence in a systems language such as Go, Rust, or C++ (for high-performance serving layers).
• Deep understanding of LLM deployment challenges and runtimes (e.g., vLLM, ONNX, TorchServe, Triton).
• Familiarity with quantization techniques (AWQ, GPTQ) to optimize model size and speed.
• Experience building complex workflows using tools like LangChain or LlamaIndex, and deploying them on containerized infrastructure (Docker/Kubernetes).
• Ability to navigate the rapidly changing AI landscape, separating hype from practical engineering solutions and driving technical alignment across teams.

🏖️ Benefits

• Generous PTO Policy
• Support for work-life balance with Unplugged Days
• Flexible WFH Policy
• Mental & Physical Wellness programs
• Phone and Internet Reimbursement program
• Access to Continued Career Development
• Comprehensive Benefits and Competitive Packages
• Paid Volunteer Time
• Employee Resource Groups
