AI Infrastructure, Platform Operations Engineer

🔥 5 minutes ago

🇪🇺 Europe – Remote

⏰ Full Time

🟡 Mid-level

🟠 Senior

🏗️ Platform Engineer

Mirantis

501 - 1000 employees

🏢 Enterprise

☁ SaaS

Cloud Computing • Enterprise • SaaS

Mirantis is a company that specializes in container management and cloud infrastructure solutions. It offers a range of products, including Mirantis Kubernetes Engine (MKE), Mirantis OpenStack for Kubernetes (MOSK), and Mirantis Container Cloud (MCC), which provide enterprise-level Kubernetes and container management platforms. Mirantis also develops tools for secure software supply chains, such as the Mirantis Container Runtime (MCR) and Mirantis Secure Registry (MSR). As an advocate for open source technologies, Mirantis supports various projects and provides resources like Lens Desktop, a popular Kubernetes IDE, and technical support for enterprises adopting cloud-native technologies. Their solutions cater to sectors such as public services, financial services, and broader SaaS and technology services industries.

📋 Description

• Monitor, operate, and support production AI infrastructure platforms.
• Investigate and resolve infrastructure, networking, hardware, and platform-related incidents.
• Support NVIDIA GPU infrastructure and associated platform services.
• Monitor and troubleshoot Kubernetes-based environments.
• Investigate performance, availability, and reliability issues across infrastructure and platform components.
• Collaborate with engineering teams, hardware vendors, datacenter personnel, and service delivery teams to resolve technical issues.
• Participate in incident response, root cause analysis, and operational improvement activities.
• Contribute to improvements in monitoring, observability, automation, and operational processes.
• Maintain operational documentation, runbooks, and knowledge articles.

🎯 Requirements

• 3+ years of experience in infrastructure operations, platform operations, network operations, site reliability engineering, cloud operations, datacenter operations, or related technical roles.
• Strong Linux administration and troubleshooting skills.
• Good understanding of networking concepts and experience diagnosing infrastructure-related issues.
• Working knowledge of Kubernetes in production environments.
• Experience supporting production infrastructure and services.
• Strong analytical and problem-solving skills.
• Experience working within structured operational and incident management processes.
• Excellent communication and collaboration skills.
• Ability to work within a shift-based operational environment.
• Experience in one or more of the following areas is highly desirable:
  • NVIDIA GPU infrastructure and accelerated computing platforms.
  • InfiniBand networking and NVIDIA UFM.
  • Kubernetes platform operations.
  • AI infrastructure or HPC environments.
  • Site Reliability Engineering (SRE) or Platform Engineering.
  • Observability platforms such as Grafana, Prometheus, ELK, or OpenTelemetry.
  • Infrastructure automation technologies and Infrastructure-as-Code practices.
  • Large-scale distributed systems and production platforms.

🏖️ Benefits

• Work with some of the most advanced AI infrastructure environments in production today.
• Gain exposure to NVIDIA GPU technologies, Kubernetes platforms, and high-performance networking environments.
• Help define how next-generation AI infrastructure is operated and supported.
• Be part of a team shaping the future of AI-powered operations through k0rdent AI.
• Join a growing organisation investing heavily in AI infrastructure and platform services.

Similar Jobs

🕒 June 12

Vira Games

51 - 200

🎮 Gaming

👥 B2C

Senior Platform Engineer designing and developing backend services for a gaming company. Focusing on GaaS platform architecture, quality assurance, and infrastructure solutions.

🇪🇺 Europe – Remote

⏰ Full Time

🟠 Senior

🏗️ Platform Engineer

🗣️🇺🇦 Ukrainian Required

🕒 May 8

bloomon

51 - 200

🛒 Retail

🛍️ eCommerce

Platform Engineer working across technology domains at Bloom & Wild. Enhancing e-commerce, data, and infrastructure solutions with a focus on autonomy and innovation.

🇪🇺 Europe – Remote

💰 Series C on 2019-03

⏰ Full Time

🟡 Mid-level

🟠 Senior

🏗️ Platform Engineer

🕒 May 5

saas.group

51 - 200

☁ SaaS

🏢 Enterprise

🤝 B2B

Senior Platform Engineer for ScraperAPI, managing and consolidating infrastructure for high-performance web scraping solutions. Collaborate with engineering teams to drive significant platform improvements.

🇪🇺 Europe – Remote

⏰ Full Time

🟠 Senior

🏗️ Platform Engineer

🕒 March 20

TD SYNNEX

10,000+ employees

🏢 Enterprise

☁ SaaS

📡 Telecommunications

Senior Platform Engineer architecting multi-cloud infrastructure for AI-driven applications at TD SYNNEX. Focusing on automation and collaboration between Developers, Business, and Operations.

🇪🇺 Europe – Remote

⏰ Full Time

🟠 Senior

🏗️ Platform Engineer

🕒 March 12

Polar

1 - 10

💳 Fintech

☁ SaaS

🔌 API

Senior Platform Engineer architecting and evolving the Polar platform for high-velocity startups. Designing systems emphasizing reliability and scalability in financial workflows across various engineering layers.

🇪🇺 Europe – Remote

⏰ Full Time

🟠 Senior

🏗️ Platform Engineer