
10,000+ employees
Founded 1993
🤖 Artificial Intelligence
🎮 Gaming
Artificial Intelligence • Gaming • Automotive
NVIDIA is a leading technology company specializing in accelerated computing and artificial intelligence. NVIDIA pioneers advancements in graphical processing units (GPUs), cloud computing, data centers, and virtual reality, with a focus on gaming, automotive, healthcare, and robotics industries. The company's innovations, such as NVIDIA Omniverse, transform traditional digital processes by enabling high-fidelity simulations and rendering tasks. Their applications span various industries, from autonomous vehicles using NVIDIA DRIVE to healthcare solutions with NVIDIA Clara, and AI-driven analytics and workflows.

• Build AI/HPC infrastructure for new and existing customers.
• Support operational and reliability aspects of large-scale AI clusters, focusing on performance at scale, real-time monitoring, logging, and alerting.
• Engage in and improve the whole lifecycle of services, from inception and design through deployment, operation, and refinement.
• Maintain services once they are live by measuring and monitoring availability, latency, and overall system health.
• Provide feedback to internal teams, for example by opening bugs, documenting workarounds, and suggesting improvements.
• BS/MS/PhD or equivalent experience in Computer Science, Electrical/Computer Engineering, Physics, Mathematics, or related fields.
• 5+ years of professional experience with networking fundamentals in Ethernet or InfiniBand environments.
• Hands-on experience with network switch/router platforms such as Cumulus Linux, SONiC, IOS, Junos OS, and EOS.
• Solid working knowledge of Ethernet/InfiniBand/RDMA core principles.
• Proficiency in end-to-end IB/Ethernet cluster deployment, adapter configuration, and firmware maintenance, plus the ability to run professional performance benchmarks with mainstream RDMA testing tools.
• Ability to independently diagnose and troubleshoot typical IB/Ethernet network anomalies, including link flapping, connection failures, and bandwidth and latency jitter.
• Command of practical RDMA network optimization strategies such as queue pair (QP) tuning, MTU configuration, and congestion control optimization.
• Hands-on experience with RDMA-accelerated scenarios, including distributed storage and high-performance computing clusters.
• Extensive experience delivering automated network provisioning solutions using tools like Ansible, Salt, and Python.
• Ability to develop CI/CD pipelines for network operations.
• Strong written, verbal, and listening skills in English.
• NVIDIA pioneered accelerated computing.
• Our AI infrastructure powers global intelligence, transforming every industry.
🔥 3 hours ago
Partner Solutions Engineer bridging Cloudera’s cutting-edge technology with global partner ecosystem. Driving partnership proficiency and solving complex technical hurdles while maximizing value of Cloudera Data Platform.
Linux
Spark
🕒 Yesterday
Solutions Architect leading design and delivery of cloud data analytics solutions for enterprise customers at phData. Collaborating with teams and guiding implementation for high-quality outcomes.
Airflow
AWS
Azure
Cassandra
Cloud
ElasticSearch
Google Cloud Platform
HDFS
Informatica
Java
Kafka
Matillion
NoSQL
Python
Scala
Spark
SQL
🕒 Yesterday
Agentic Solution Engineer building and scaling workflows with Netomi’s no-code platform for AI. Collaborating with teams to deliver autonomous AI solutions for enterprise customer experience.
🕒 3 days ago
Solution Architect defining Power BI architecture and governance, performing tenant-to-tenant migration for Kyndryl. Establishing runbooks and coordinating dependencies while ensuring operational reporting.
🕒 June 5
Solution Engineer, Sr providing technical expertise and support for ERP sales opportunities at Epicor. Coaching Solution Engineers and conducting technical presentations for clients.
ERP