
10,000+ employees
Founded 1993
Artificial Intelligence • Gaming • Automotive
NVIDIA is a leading technology company specializing in accelerated computing and artificial intelligence. It develops graphics processing units (GPUs) and related technology for cloud computing, data centers, and virtual reality, with a focus on the gaming, automotive, healthcare, and robotics industries. Its platforms include NVIDIA Omniverse for high-fidelity simulation and rendering, NVIDIA DRIVE for autonomous vehicles, and NVIDIA Clara for healthcare, alongside AI-driven analytics and workflows.
🔥 0 minutes ago
• Primary role is to perform coordination and communication across NVIDIA’s datacenter portfolio from an operations perspective regarding incidents, maintenance, and reporting/monitoring.
• Develop standards and programs in support of reliability and operations initiatives, including Problem and Change Control, and define and maintain a health score for sites and environments, including testing methods to predict and isolate points of failure, assessing and advising on maintenance strategies, and providing related reporting and metrics.
• Study failure data and work with machine learning and AI teams and tools to predict future failures, and facilitate reliability studies such as critical assessments, RAM models, and RCM studies.
• Identify and drive automation & process improvement opportunities across catalog quality workflows and reporting.
• Coordinate disaster recovery tests, liaise during audits, collaborate with internal partners, and make vital progress to ensure business continuity and compliance.
• Perform risk assessments to ensure compliance with policies, procedures, rules & regulations, and data center standards.
• Own and present end-to-end key business metrics related to incident response, including ownership and representation of internal and external tooling.
• Lead root cause analysis for outages and adjust documentation, workflows, and operating procedures to avoid future incidents.
• Assess process improvement & transformation opportunities and partner with process owners & collaborators to scope opportunities, define problem statements and objectives, and structure projects and teams.
• Work multi-functionally with other team members and groups within the organization, and develop strong, productive relationships across peer organizations that further the organization's business objectives.
• Bachelor’s degree in a related field (e.g., Electrical Engineering, Mechanical Engineering, Industrial Engineering, Computer Engineering, Telecommunication Engineering, Computer Science, or business-related field) or equivalent experience.
• 5+ years of operations or environmental, health, and safety experience within data centers.
• Proficient in developing and driving reliability activities (modeling predictions, life cycle testing, stress testing, etc.).
• Commercial and financial awareness, with a full comprehension of the impact of failure in translation to business costs, production targets, and fulfillment of customer orders.
• Highly developed numeracy, statistical, and reporting skills; ability to analyze, interpret, and apply information, data, and trends.
• Enthusiastic about achieving goals and maintaining organization, capable of strategizing and meeting set objectives.
• Demonstrated ability to be meticulous, organized, and capable of consolidating data analyses for presentation to large-scale groups.
• Proficient in the use of asset database and DCIM solutions to extract data and develop meaningful insights.
• Experience in designing, deploying, or maintaining large-scale datacenter infrastructure (whether ACSMEP or networking) or the ability to create strategic infrastructure roadmaps including on-premise, hybrid, and cloud technologies.
• Demonstrated knowledge and advanced proficiency working with Microsoft Office Suite software and G-Suite software.
• Health insurance
• Professional development opportunities
🔥 15 hours ago
AI Security Engineer at J.S. Held focusing on security for AI technologies and applications. Engineering controls for cloud and AI systems, ensuring secure implementation and monitoring.
🇮🇳 India – Remote
💰 Private Equity Round on 2015-10
⏰ Full Time
🟠 Senior
🔴 Lead
👮♂️ Cybersecurity / Security Engineer
Azure
Cloud
Cyber Security
🕒 2 days ago
Cloud Security Architect guiding the design and implementation of secure cloud architectures for Gruve. Leading security governance and ensuring compliance across cloud platforms.
AWS
Azure
Cloud
Google Cloud Platform
Kubernetes
Python
🕒 3 days ago
Senior Analyst managing information security governance and awareness programs at Oportun. Ensuring compliance with security frameworks and enhancing organizational security culture.
Cyber Security
🕒 4 days ago
Workday Security Senior Manager overseeing Workday security architecture and compliance for Fortrea. Collaborating with HR, Finance, IT, and Compliance teams to safeguard sensitive data access.
🕒 6 days ago
AI Security Engineer at Credit Acceptance securing AI and ML systems with a focus on compliance and risk mitigation. Collaborate globally to embed security into AI systems by design.
AWS
Azure
Cloud
Google Cloud Platform