HPC Solutions Engineer

🔥 0 minutes ago

Hydra Host

11 - 50 employees

🔧 Hardware

🏢 Enterprise

🤖 Artificial Intelligence

💰 $10M Seed Round on 2022-04

Hydra Host provides high-performance computing solutions, offering dedicated bare metal GPU servers optimized for AI and HPC workloads. Its platform lets users rent top-tier GPUs globally, with an emphasis on performance, security, and customization. Hydra Host's infrastructure includes a marketplace, known as Brokkr, that offers a wide array of GPU configurations and solutions tailored for mission-critical applications such as AI, big data, and machine learning. Customers keep full control over their server environments, with options to scale as their needs grow. The company's offerings are trusted by leading firms seeking efficient and innovative computing solutions.

📋 Description

• Work with customers in technical discovery to help define requirements and deliverables for their use cases and help them to effectively utilize distributed GPU computing resources.
• Identify and recommend the best tools for each customer, while building boilerplate and reference implementations that can be reused for subsequent customers.
• Manage GPU clusters and coordinate with IT to ensure efficient operation of NVIDIA GPUs, utilizing technologies such as InfiniBand for high-speed networking.
• Oversee the deployment and maintenance of machine learning environments using virtual storage solutions and distributed computing (HPC) tools such as SLURM.
• Automate infrastructure provisioning and management using tools such as Ansible and Terraform.
• Collaborate with data scientists and engineers to ensure seamless integration of ML models into production environments.
• Conduct performance tuning and optimization of systems to maximize throughput and reduce latency.
• Stay current with the latest industry trends in machine learning technologies and HPC to ensure the use of best practices in infrastructure setup and model development.
• Document and maintain operational procedures and system configurations.

🎯 Requirements

• 7+ years of experience in high-performance computing, distributed machine learning, GPU computing, and/or system architecture
• Proficiency in managing NVIDIA GPU environments and familiarity with GPU computing frameworks and libraries
• Strong experience with high-speed networking technologies, specifically InfiniBand
• Experience with HPC job schedulers, preferably SLURM
• Expertise in automating environment setup and maintenance using Ansible and Terraform
• Demonstrated ability in deploying and managing virtual storage solutions
• Strong coding skills in Python and familiarity with machine learning libraries and frameworks
• Excellent problem-solving, communication, and teamwork skills

🏖️ Benefits

• You will work with the most diverse hardware configurations and locations available to anyone in the industry, as well as cutting-edge GPU use cases
• This role is fully remote, with a high-accountability, high-agency culture
• This role offers a competitive salary, equity, and benefits
• This role offers flexible PTO
