Senior HPC Systems Engineer

November 4, 2023


Lambda

Designing the world's most advanced GPU systems for Deep Learning.

Deep Learning • Machine Learning • Artificial Intelligence

51 - 200 employees

💰 $39.7M Venture Round on 2022-11

Description

• Design and architect the state-of-the-art AI supercomputers powering our cloud
• Introduce technology and software to improve the performance, resiliency, and quality of service of our HPC storage and networking infrastructure
• Work closely with our ML team to benchmark, tune, and optimize our hypervisors, network, and storage
• Set up monitoring, logging, and alerting to ensure high availability and observability
• Provide guidance and represent the interests of our HPC customers

Requirements

• Expertise in architecting, operating, and debugging large-scale HPC network and storage infrastructure, ideally using MPI, NCCL, RDMA, InfiniBand, and parallel file systems
• Experience building complex, high-quality software using Python
• Deep understanding of Linux fundamentals, especially its networking stack
• Experience with large GPU clusters is strongly preferred
• Experience with virtualization and Kubernetes
• Strong engineering background: Computer Science, Electrical Engineering, Mathematics, or Physics
• Have led and taken full ownership of large, ambiguous, cross-team projects from conception to production
• Enjoy moving fast and making a large business impact
• Value working on a team of high performers who hold each other accountable
• Are a self-starter, curious, and not afraid to ask when in doubt
• Are a quick learner and enjoy learning new technologies
• Value working on a low-ego team that emphasizes strong communication, collaboration, and getting to the right answer as a team

Benefits

• We offer generous cash & equity compensation
• Investors include Gradient Ventures, Google’s AI-focused venture fund
• We are experiencing extremely high demand for our systems, with quarter-over-quarter, year-over-year profitability
• Our research papers have been accepted into top machine learning and graphics conferences, including NeurIPS, ICCV, SIGGRAPH, and TOG
• We have a wildly talented team of 200 and are growing fast
• Health, dental, and vision coverage for you and your dependents
• Commuter/Work from home stipends
• 401(k) Plan with 2% company match
• Flexible Paid Time Off Plan that we all actually use
