Platform Support Architect

🕒 Yesterday

DDN

1001 - 5000 employees

Founded 1998

🤖 Artificial Intelligence

💰 $10M Funding Round on 2011-06

Artificial Intelligence • Data Center and Cloud Computing • High Performance Computing

DDN is a global leader in AI data intelligence solutions, providing high-performance computing and sophisticated data management technologies. With a focus on accelerating AI deployments and advanced data analytics, DDN's products, including the Data Intelligence Platform and advanced storage systems, serve diverse sectors such as healthcare, financial services, and government. DDN is committed to transforming enterprise data infrastructure to leverage the full potential of AI and drive operational efficiency.

📋 Description

• Act as the primary NVIDIA AI Enterprise and vector database solutions expert for HyperPOD customer environments
• Own complex end-to-end triage across GPU, NVAIE services, vector DB, Kubernetes, Docker, high-speed networking, and Infinia storage
• Diagnose and resolve performance bottlenecks in RAG and agentic AI workflows
• Collect and interpret logs and telemetry across Linux, containers, Kubernetes, GPU stack, vector DB, and storage/networking
• Author and maintain support triage runbooks and checklists for HyperPOD
• Define and validate unified diagnostics bundles
• Collaborate with observability and tools teams to shape dashboards
• Build hands-on labs and PoCs that mirror customer use cases
• Develop reusable technical assets

🎯 Requirements

• 5+ years in Linux-based infrastructure roles (SRE, MLOps, platform engineering, or L2/L3 support)
• Strong hands-on experience with containers and Kubernetes
• Demonstrated experience operating GPU-accelerated workloads in production
• Familiarity with DGX/HGX or similar GPU cluster platforms
• Practical experience with AI storage and networking for HPC/AI clusters
• Experience with one or more vector databases (Milvus, Qdrant, Pinecone, etc.)
• Solid understanding of RAG and Generative AI workflows
• Familiarity with NVIDIA AI Enterprise components and toolchain
• Experience designing, operating, or supporting MLOps/GenAI pipelines
• Strong diagnostic skills across Linux, containers, Kubernetes, GPUs, storage, and networking
• Track record of building reusable technical assets that improve support readiness and partner/customer success
• Excellent communication skills, capable of clearly explaining complex AI platform topics to both engineers and executive stakeholders

Similar Jobs

🕒 Yesterday

Horizon3.ai

51 - 200

Senior Technical Support Engineer providing advanced technical support for enterprise customers at Horizon3.ai. Collaborating with cross-functional teams to resolve complex technical issues in cloud and security environments.

AWS

Azure

Cloud

Docker

Google Cloud Platform

Linux

Python

🕒 Yesterday

Westinghouse Electric Company

5001 - 10000

⚡ Energy

Commissioning Support Engineer responsible for planning and implementing commissioning activities for nuclear power plants. Ensuring compliance with design specifications and regulatory requirements in a collaborative team environment.

🕒 Yesterday

Tobii Dynavox

201 - 500

⚕ Healthcare Insurance

👥 HR Tech

🧘 Wellness

Technical Support Rep responsible for providing bilingual assistance in a call center environment. Answering calls, emails, and chat requests while documenting interactions and troubleshooting issues.

đŸ—ŁïžđŸ‡Ș🇾 Spanish Required

🕒 Yesterday

ATS Corporation

5001 - 10000

🚀 Aerospace

Technical Support Engineer providing remote troubleshooting for SP Scientific freeze dryer systems. Collaborating with global teams and ensuring customer satisfaction through effective technical support.

🕒 Yesterday

Automox

201 - 500

☁ SaaS

🔐 Security

🏢 Enterprise

Associate Technical Support Engineer at Automox supporting customers with cloud-native endpoint management platform. Troubleshooting across Windows, macOS, and Linux with strong customer communication skills.

DNS

Firewalls

Linux

MacOS