Senior ML Evaluation Engineer – Autonomous Vehicles

🕒 April 17

NVIDIA

10,000+ employees

Founded 1993

Artificial Intelligence • Gaming • Automotive

NVIDIA is a leading technology company specializing in accelerated computing and artificial intelligence. It pioneers advancements in graphics processing units (GPUs), cloud computing, data centers, and virtual reality, with a focus on the gaming, automotive, healthcare, and robotics industries. The company's innovations, such as NVIDIA Omniverse, transform traditional digital processes by enabling high-fidelity simulation and rendering. Its applications span many industries, from autonomous vehicles using NVIDIA DRIVE to healthcare solutions with NVIDIA Clara, as well as AI-driven analytics and workflows.

📋 Description

• Design and build learned evaluation pipelines that assess driving behavior using LLMs, VLMs, and multimodal models
• Develop agentic workflows that chain model inference, retrieval, and structured reasoning to evaluate complex driving scenarios
• Define evaluation-of-evaluation methodology: how do we know our learned evaluators are correct?
• Build golden-set frameworks and calibration loops for learned metrics
• Partner with AML (Alpamayo Logos) teams on model-specific eval needs (e.g., COT prediction quality, AML regression coverage)
• Instrument evaluation systems with robust experiment tracking, A/B comparison tooling, and model versioning
• Contribute to the team's transition from rule-based to learned evaluation: identify metrics and analyzers that are candidates for ML replacement and build the alternatives

🎯 Requirements

• PhD with 4+ years, MS with 6+ years, or BS (or equivalent experience) with 8+ years of relevant experience in Computer Science, Computer Engineering, or a related technical field
• Hands-on experience building LLM/VLM-based pipelines: fine-tuning, prompt engineering, retrieval-augmented generation, chain-of-thought
• Track record of shipping ML systems to production (not just prototyping or publishing)
• Strong software engineering fundamentals: you write clean, tested, reviewable code in Python and C++
• Experience with evaluation methodology: precision/recall, inter-rater reliability, calibration, annotation pipelines
• Comfort with large-scale data processing (Spark, Dask, or similar)
• Strong Python skills
• Experience with PyTorch or JAX
• Comfortable with GPU-based training workflows

🏖️ Benefits

• Equity
• Benefits
