Senior GPU Infrastructure Engineer

🕒 May 18

decircle

1 - 10 employees

Founded 2019

We are a boutique recruitment agency focused on blockchain and Web3. We partner with disruptive organizations to help them identify and attract talent. We enjoy networking with Web3 and decentralization enthusiasts and helping them find their place in the decentralized world.

📋 Description

• Help build and scale Hyperbolic's GPU Cloud Marketplace
• Build a multi-tenancy provisioning and virtualization solution
• Transform raw GPUs from diverse global suppliers into a programmable, orchestrated pool
• Serve thousands of AI developers and researchers
• Work at the cutting edge of cloud infrastructure
• Build the core orchestration layer that enables the platform to deliver up to 75% cost savings compared to traditional cloud providers

🎯 Requirements

• Deep understanding of bare-metal provisioning and lifecycle management, including IPMI/Redfish, BMC-based remote management, PXE boot, and automated OS deployment workflows
• Deep understanding of GPU scheduling and orchestration, including GPU type awareness, memory management, topology considerations, placement strategies for multi-GPU jobs, and fragmentation minimization
• Strong infrastructure and DevOps engineering skills with proficiency in Terraform or Pulumi, CI/CD for infrastructure, secrets management, configuration management, and observability stack implementation
• Experience with storage and data infrastructure for AI/ML workloads, including object storage, high-IOPS block storage, and distributed file systems for training data and checkpoints
• Proficiency with API design and cloud-init for automated provisioning and configuration
• Solid understanding of GPU architecture, CUDA, and GPU compute optimization
• Highly collaborative team player with excellent communication skills across technical and non-technical stakeholders
• Proven ability to work effectively with hardware vendors and vendor engineering teams to troubleshoot issues and optimize integrations
• Experience building and scaling cloud infrastructure or distributed systems in production environments

Similar Jobs

🕒 May 15

Progressive Leasing

1001 - 5000

🛒 Retail

💸 Finance

💳 Fintech

Infrastructure Systems Engineer supporting AWS-based infrastructure for Progressive Leasing, ensuring reliability and scalability in a remote work environment.

AWS

Bootstrap

Cloud

DNS

Kubernetes

Linux

Terraform

🕒 May 15

Jito Labs

1 - 10

Infrastructure Engineer responsible for operating end-to-end infrastructure for a high-performance trading platform on Solana, focusing on cloud environments, automation, and security tasks.

Ansible

Cloud

DNS

Grafana

Linux

Prometheus

Terraform

Vault

🕒 May 14

Chainalysis Inc.

501 - 1000

🔌 API

💳 Fintech

🔒 Cybersecurity

Senior Infrastructure Engineer responsible for cloud infrastructure and Kubernetes operations at Chainalysis. Leading architectural decisions and mentoring engineers for high-security deployment platforms.

AWS

Cloud

EC2

Flux

Jenkins

Kubernetes

Linux

Shell Scripting

Terraform

🕒 May 14

Bullhorn

1001 - 5000

👥 HR Tech

☁️ SaaS

🎯 Recruiter

Infrastructure Engineer II managing datacenter operations and server administration at Bullhorn. Focused on high-quality service delivery and infrastructure project management in a dynamic setting.

AWS

Azure

Cloud

Google Cloud Platform

Linux

NFS

TCP/IP

🕒 May 14

Rocket Money (formerly Truebill)

51 - 200

💸 Finance

💳 Fintech

👥 B2C

Senior Infrastructure Engineer leading cloud security and evolving infrastructure strategies at Rocket Money. Join a team supporting millions in improving financial lives securely and at scale.

AWS

Cloud

Firewalls

Google Cloud Platform

Terraform