Software Engineer, Compute Infrastructure

🕒 April 27

🏢🏡 San Francisco – Hybrid

💵 $230k - $405k / year

⏰ Full Time

🟡 Mid-level

🟠 Senior

🧑‍💻 Full-stack Engineer

🦅 H1B Visa Sponsor

OpenAI

201 - 500 employees

Founded 2015

🤖 Artificial Intelligence

☁️ SaaS

🏢 Enterprise

OpenAI is an AI research organization and company that builds advanced artificial intelligence technology with a strong emphasis on safety and ethics. Its mission is to ensure that artificial general intelligence (AGI) benefits all of humanity. The company develops AI products such as ChatGPT, which assists users with tasks ranging from everyday requests to complex enterprise solutions, and provides an API platform for integrating its AI models into applications. OpenAI focuses on innovation in AI and on improving data analysis capabilities, alongside the safe and ethical governance of its systems.

📋 Description

• Spin up and scale large Kubernetes clusters, including automation for provisioning, bootstrapping, and cluster lifecycle management
• Build software abstractions that unify multiple clusters and present a seamless interface to training workloads
• Own node bring-up from bare metal through firmware upgrades, ensuring fast, repeatable deployment at massive scale
• Improve operational metrics, for example by reducing cluster restart times (e.g., from hours to minutes) and accelerating firmware or OS upgrade cycles
• Integrate networking and hardware health systems to deliver end-to-end reliability across servers, switches, and data center infrastructure
• Develop monitoring and observability systems to detect issues early and keep clusters stable under extreme load

🎯 Requirements

• Experience as an infrastructure, systems, or distributed systems engineer in large-scale or high-availability environments
• Strong knowledge of Kubernetes internals, cluster scaling patterns, and containerized workloads
• Proficiency in compute infrastructure concepts (compute, networking, storage, security) and in automating cluster or data center operations
• Bonus: background with GPU workloads, firmware management, or high-performance computing

🏖️ Benefits

• Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts
• Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)
• 401(k) retirement plan with employer match
• Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)
• Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees
• 13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)
• Mental health and wellness support
• Employer-paid basic life and disability coverage
• Annual learning and development stipend to fuel your professional growth
• Daily meals in our offices, and meal delivery credits as eligible
• Relocation support for eligible employees
• Additional taxable fringe benefits, such as charitable donation matching and wellness stipends, may also be provided
