
AI • Enterprise • SaaS
Nebius Group is building one of the world's leading AI infrastructure companies, focusing on providing the necessary compute, storage, and tools for developers in the AI space. Based in Europe and listed on Nasdaq, Nebius has a global presence with R&D centers across Europe, North America, and Israel. The company's primary offering is an AI-centric cloud platform designed for intensive AI workloads, complemented by various other businesses involved in generative AI development, edtech, and autonomous technology.
August 19

⢠Cluster Design: Architect scalable GPU cluster topologies including compute nodes, interconnect (InfiniBand, Ethernet), storage, and control planes. ⢠Performance Modeling: Analyze AI/ML workloads (e.g. LLM training, inference) to inform design tradeoffs across latency, bandwidth, and GPU density. ⢠Network Architecture: Align with network architect relevant design and validate low-latency, high-throughput interconnects (e.g., InfiniBand HDR/NDR, RoCEv2) at POD and DC scale. ⢠Storage Integration: Work with storage teams to optimize performance for training datasets, checkpointing, and others. ⢠Reliability & Monitoring: Understand and analyze signal from monitoring systems to the detect flows in design ⢠Collaboration: Partner with site reliability, networking, storage, and DC engineering teams to operationalize and scale your architecture.
⢠5+ years of experience designing clusters. ⢠Deep understanding of modern GPU architecture (NVIDIA, AMD, etc.). ⢠Experience with HPC interconnects (InfiniBand & RoCE). ⢠Solid background in systems architecture, networking, and hardware reliability. ⢠Experience in scripting for automation and telemetry pipelines (Python, Go, etc.)
⢠Competitive salary and comprehensive benefits package. ⢠Opportunities for professional growth within Nebius. ⢠Hybrid working arrangements. ⢠A dynamic and collaborative work environment that values initiative and innovation.