
10,000+ employees
Founded 1993
🤖 Artificial Intelligence
🎮 Gaming
Artificial Intelligence • Gaming • Automotive
NVIDIA is a leading technology company specializing in accelerated computing and artificial intelligence. NVIDIA pioneers advancements in graphics processing units (GPUs), cloud computing, data centers, and virtual reality, with a focus on the gaming, automotive, healthcare, and robotics industries. The company's innovations, such as NVIDIA Omniverse, transform traditional digital processes by enabling high-fidelity simulation and rendering. Its applications span many industries, from autonomous vehicles using NVIDIA DRIVE to healthcare solutions with NVIDIA Clara, as well as AI-driven analytics and workflows.
🕒 February 3
🏄 California, Oregon, +2 more states – Remote
💵 $184k - $287.5k / year
⏰ Full Time
🟠 Senior
🗣️ LLM Engineer
🦅 H1B Visa Sponsor

• Develop infrastructure software and tools for large-scale pre-training, post-training, and inference.
• Develop and optimize tools and libraries to improve infrastructure efficiency and resiliency.
• Co-design and implement APIs for integration with NVIDIA's resiliency stacks.
• Enhance infrastructure and products underpinning NVIDIA's AI platforms.
• Define meaningful and actionable reliability metrics to track and improve system and service reliability.
• Apply problem-solving, root cause analysis, and optimization skills.
• Root-cause, analyze, and triage failures from the application level to the hardware level.
• 8+ years of experience developing software infrastructure for large-scale AI systems.
• Bachelor's degree or higher in Computer Science or a related technical field (or equivalent experience).
• Strong debugging skills and experience analyzing and triaging AI applications from the application level to the hardware level.
• Experience with observability platforms for monitoring and logging (e.g., ELK, Prometheus, Loki).
• Proven track record of building and scaling large-scale distributed systems.
• Experience with AI training and inference infrastructure services.
• Proficiency in programming languages such as Python and C/C++, and in scripting languages.
• Experience with quality software engineering practices, including test development, defensive programming, version control, and CI.
• Excellent communication and collaboration skills; a culture of diversity, intellectual curiosity, problem solving, and openness is essential.
• Equity
• Benefits