
11 - 50 employees
🔧 Hardware
🏢 Enterprise
🤖 Artificial Intelligence
💰 $10M Seed Round on 2022-04
Hardware • Enterprise • Artificial Intelligence
Hydra Host is a provider of high-performance computing solutions, offering dedicated bare metal GPU server access optimized for AI and HPC workloads. Their platform allows users to access and rent top-tier GPUs globally, providing unparalleled performance, security, and customization. Hydra Host's infrastructure includes a marketplace, known as Brokkr, that offers a wide array of GPU configurations and solutions tailored for mission-critical applications such as AI, big data, and machine learning. Through their robust, secure, and scalable solutions, Hydra Host ensures customers enjoy full control over their server environments, with options for scalability and future-readiness. The company's offerings are trusted by leading firms seeking efficient and innovative computing solutions.
🕒 February 10
🇺🇸 United States – Remote
💵 $150k - $225k / year
⏰ Full Time
🟡 Mid-level
🟠 Senior
👷 Infrastructure Engineer

• Get AI Platform customers production-ready on Hydra: standing up Kubernetes clusters, configuring GPU drivers, validating networking, and troubleshooting the issues that surface when real workloads hit real hardware.
• Own the bare metal ←→ platform layer: bridging GPU infrastructure (NCCL, InfiniBand, NVLink, storage) with orchestration layers (Kubernetes, SLURM) and MLOps tooling that customers actually use.
• Configure, benchmark, and debug NVIDIA driver stacks: firmware versions, CUDA compatibility, NCCL tuning, MIG configurations.
• Run quality benchmarks and diagnostics to validate performance for inference and training workloads across chip types.
• Identify gaps before customers do: pressure-testing Hydra's infrastructure, APIs, and workflows to find what's missing or broken.
• Turn customer learnings into product: working with Product and Engineering to build reusable templates, default configurations, and automated workflows that eliminate manual onboarding.
• Advise customers on chip selection and tokenomics: helping AI platform customers understand price/performance trade-offs across GPU types, cost-per-token economics, and which hardware fits their inference or training workloads.
• Bare metal Linux depth: you've administered GPU servers at the metal, including driver stacks, kernel tuning, firmware, and storage configuration. Not just managed K8s.
• NVIDIA GPU stack expertise: drivers, CUDA, NCCL, NVLink, nvidia-smi profiling. You understand how stack compatibility affects performance.
• Kubernetes and orchestration: production experience with K8s, SLURM, or similar. You know how to stand up clusters, not just deploy to them.
• AI Networking fundamentals: TCP/IP, VLANs, bonding, and high-speed interconnects (InfiniBand, RoCE) for distributed workloads.
• Customer-facing communication: you can work directly with engineers at AI platform companies, understand their constraints, and translate that into clear requirements for your team.
• Bias toward scalable solutions: you'd rather build a feature that helps 10 customers than a custom deployment that helps 1.
Nice to Have
• HPC or large-scale distributed training environments.
• AI workload experience (vLLM, PyTorch, inference frameworks).
• Storage systems (NVMe, distributed filesystems, CEPH, WEKA).
• IaC and provisioning tools (Terraform, Ansible, Cloud-init, MaaS).
• Competitive salary
• Equity ownership
• Healthcare: medical, dental, vision for you and your family
• Remote-first, with hubs in Phoenix, Boulder, and Miami
• Direct impact: your work shapes how GPU infrastructure gets deployed across the AI ecosystem
🕒 February 6
Infrastructure Engineer managing Kubernetes clusters and enhancing networking security for a serverless edge compute platform at Telnyx.
🇺🇸 United States – Remote
💰 $2.1M Seed Round on 2014-08
⏰ Full Time
🟡 Mid-level
🟠 Senior
👷 Infrastructure Engineer
🦅 H1B Visa Sponsor
Ansible
Cloud
Firewalls
Kubernetes
Linux
Prometheus
Terraform
🕒 February 2
Senior Cloud Data Infrastructure Engineer at ClickHouse building cloud-native database platforms. Collaborating on autoscaling solutions and enhancing cloud infrastructure performance.
🇺🇸 United States – Remote
💵 $133.4k - $197.2k / year
⏰ Full Time
🟠 Senior
👷 Infrastructure Engineer
🦅 H1B Visa Sponsor
AWS
Azure
Cloud
Distributed Systems
EC2
Google Cloud Platform
Java
Kafka
Kubernetes
Numpy
Pandas
Python
Spark
Go
🕒 January 30
51 - 200 employees
Lead Infrastructure Engineer defining and shaping infrastructure at Atticus. Working closely with product teams to develop necessary platforms and tools for efficient operations.
Cloud
Google Cloud Platform
Terraform
🕒 January 29
Senior Infrastructure Engineer for Earth Species Project. Designing scalable AI data pipelines to decode animal communication with advanced AI and supporting infrastructure team growth.
🇺🇸 United States – Remote
💵 $225.5k - $235.5k / year
⏰ Full Time
🟠 Senior
👷 Infrastructure Engineer
🦅 H1B Visa Sponsor
Apache
AWS
Azure
BigQuery
Cloud
Distributed Systems
Docker
Google Cloud Platform
Kubernetes
Python
PyTorch
Spark
Terraform
🕒 January 15
Senior Infrastructure Engineer designing and securing enterprise infrastructure for healthcare client in Texas. Responsible for ensuring system stability, resiliency, and security across hybrid and cloud environments.
Ansible
Cloud
DNS
Linux
Python
VMware