Senior Engineering Manager – Data Center Telemetry, RAS


November 18

NVIDIA

Artificial Intelligence • Gaming • Automotive

NVIDIA is a technology company specializing in accelerated computing and artificial intelligence. It develops graphics processing units (GPUs) and products for cloud computing, data centers, and virtual reality, with a focus on the gaming, automotive, healthcare, and robotics industries. Platforms such as NVIDIA Omniverse enable high-fidelity simulation and rendering. Its applications span many industries, from autonomous vehicles using NVIDIA DRIVE to healthcare solutions with NVIDIA Clara and AI-driven analytics and workflows.

10,000+ employees

Founded 1993

🤖 Artificial Intelligence

🎮 Gaming

📋 Description

• Lead Data Center Compute Telemetry & RAS: Own the end-to-end architecture and delivery of telemetry solutions, including fleet health monitoring, fault remediation, and data visualization at scale.
• OOB Telemetry Ownership: Own the OOB telemetry solution and data validation for telemetry from each underlying device.
• Build and Mentor a World-Class Team: Recruit, develop, and motivate a high-performing engineering team focused on platform telemetry, RAS, and observability.
• Process Optimization: Continuously improve software development processes for optimal productivity and quality.
• Cross-Functional Collaboration: Work across teams to ensure seamless integration of telemetry solutions with platform firmware, server architecture, and data center management.
• Product Ownership: Drive product life cycles with QA teams, ensuring robust testing, productization, and delivery.
• Performance Management: Conduct performance reviews, foster a culture of excellence, and ensure high productivity.

🎯 Requirements

• 12+ years of relevant experience overall and 5 years managing systems/platform software teams, ideally in server RAS, firmware, telemetry, or data center solutions.
• BS, MS, or PhD in EE/CS or a related field (or equivalent experience).
• Strong knowledge of DMTF/PLDM for OOB telemetry collection, time series databases (e.g., InfluxDB, Prometheus), and REST APIs (Redfish).
• Deep understanding of server and firmware architecture and of optimization for low-latency APIs.
• Proven track record of delivering scalable server products and telemetry solutions.
• Experience with SCM (Git, Perforce) and project management tools (Jira).
• Excellent written and oral communication skills, strong work ethic, and commitment to teamwork.
• Hands-on experience with x86/ARM system architecture and coding (C/C++, Python).
• Familiarity with Confidential Compute and notification systems.
• Demonstrated ability to analyze algorithms for time/space complexity and system resource requirements.

🏖️ Benefits

• Equity
• Benefits
