
11 - 50 employees
🤖 Artificial Intelligence
Artificial Intelligence • Cloud Computing
FluidStack provides GPU supercomputing infrastructure for AI labs. It offers on-demand access to thousands of NVIDIA GPUs for large-scale AI training and inference. The company deploys and manages large GPU clusters, with support for Kubernetes and Slurm, high availability, and responsive support. Its fully managed cloud infrastructure lets AI companies focus on developing models rather than on the underlying hardware. FluidStack emphasizes performance and cost-efficiency, scaling to thousands of GPUs with high uptime and rapid response times.
🔥 0 minutes ago
🗽 New York – Remote
💵 $60 - $75 / hour
⏳ Contract/Temporary
🟡 Mid-level
🟠 Senior
⛑ DevOps & Site Reliability Engineer (SRE)
• Deploy and validate data center network infrastructure (front-end, back-end, BMS, management): configure switches, install and validate optics, coordinate cabling, and drive deployments to completion.
• Ensure physical connectivity meets production standards: coordinate fiber remediation, validate insertion loss and OTDR traces, troubleshoot optical-layer issues, and document as-builts.
• Manage hardware logistics: device staging, rack/stack coordination, RMA, DCIM updates, inventory, and vendor shipments so devices are ready when deployments need them.
• Partner with DC Operations, ICT, Hardware, and Network Engineering to identify blockers early, escalate decisively, and keep multi-team efforts on track.
• Maintain cutsheets, as-builts, validation results, and lessons learned, and contribute to the deployment playbook that lets the team scale.
• Provide operational support during and after deployments: incident response, troubleshooting, and break-fix.
• You've done hands-on data center network engineering and understand modern fabrics (EVPN/VXLAN, BGP, CLOS), having configured production infrastructure.
• You thrive in the field, equally comfortable pulling cable, configuring switches, and troubleshooting the optical layer, and you execute with whatever tooling is available.
• You troubleshoot methodically across physical and logical layers: OTDR traces, insertion loss, BGP sessions, and connectivity through complex topologies.
• You communicate clearly across technical and non-technical teams, document well, and follow through.
• You learn fast and take ownership of ramping on new technology, and you'll travel 50 to 60% to onsite deployments.
• Bonus: AI fabric turn-up experience. Configuration management and automation tooling. Vendor certifications.
• Competitive total compensation package (salary + equity).
• Retirement or pension plan, in line with local norms.
• Health, dental, and vision insurance.
• Generous PTO policy, in line with local norms.