Principal Operations Engineer – Hardware, Data Center Operations

🔥 0 minutes ago


FluidStack

11 - 50 employees

🤖 Artificial Intelligence

Artificial Intelligence • Cloud Computing

FluidStack is a company that provides GPU supercomputing infrastructure for AI labs. It offers on-demand access to thousands of Nvidia GPUs, enabling large-scale AI training and inference. The company specializes in deploying and managing large GPU clusters with support for technologies like Kubernetes and Slurm, ensuring high availability and excellent support. FluidStack provides a fully managed cloud infrastructure, helping AI companies to focus on developing models without worrying about the underlying hardware. They emphasize performance and cost-efficiency, offering services that scale to thousands of GPUs with high uptime and rapid response times.

📋 Description

• Serve as the most senior technical authority for the operational hardware fleet across our hyperscale AI data center portfolio.
• Ensure that the GPU systems, servers, and supporting hardware we deploy at scale are operated, maintained, and continuously improved.
• Lead site assessments and operational audits.
• Drive the technical readiness of teams ahead of site activation.
• Review hardware platforms and integration designs from an operational lens.
• Feed operational learnings back into the hardware engineering, deployment, and supply chain organizations as we shift toward a productized, repeatable build model.

🎯 Requirements

• 10+ years of hands-on experience operating mission-critical hardware infrastructure, with at least 5 years as the senior technical voice on a site, campus, or fleet.
• Data center operations experience strongly preferred; hyperscale, large HPC, cloud, or other mission-critical compute infrastructure experience considered.
• Deep working command of GPU systems, server platforms, storage infrastructure, firmware lifecycle management, and hardware diagnostics, earned in the field, not from a textbook.
• Demonstrated ability to author, approve, and execute high-risk MOPs and change records in live production environments.
• A track record of leading root cause analysis on significant hardware events and driving corrective actions to closure.
• A track record of holding OEMs, ODMs, service vendors, and deployment partners accountable: you know how to enforce a standard without burning the relationship.
• Strong written communication: operational health assessments, RCAs, procedure reviews, and design review feedback are second nature.
• Comfort operating as the senior technical voice across operations, hardware engineering, network, facilities, supply chain, and customer-facing teams.
• Willingness to travel extensively across the fleet (50-75%).

🏖️ Benefits

• Competitive total compensation package (salary + equity).
• Retirement or pension plan, in line with local norms.
• Health, dental, and vision insurance.
• Generous PTO policy, in line with local norms.

