Senior Platform Engineer – MLOps

November 4

Quantiphi

Artificial Intelligence • Enterprise • Education

Quantiphi is a leading AI-first digital engineering company that leverages a decade of industry expertise to empower businesses through scalable, secure, and adaptable AI solutions. By integrating cutting-edge technology with real-world applications, Quantiphi transforms organizations across various sectors including healthcare, finance, education, and retail. Their services span AI applications, data analytics, cloud infrastructure modernization, and custom AI implementations. Quantiphi partners with technology giants like AWS, Google Cloud, NVIDIA, and others to drive AI adoption and deliver transformational opportunities for enterprises.

1001 - 5000 employees

Founded 2013

🤖 Artificial Intelligence

🏢 Enterprise

📚 Education

💰 Series A on 2019-12

📋 Description

• Orchestrating LLM Workflows & Development: Design, implement, and scale the underlying platform that supports GenAI workloads, whether real-time or batch.
• LLMOps (LLM Operations): Build and manage operational pipelines for training, fine-tuning, and deploying LLMs such as Llama, Mistral, GPT-3/4, BERT, or similar.
• GPU Optimization: Optimize GPU utilization and resource management for AI workloads, ensuring efficient scaling, low latency, and high throughput in model training and inference.
• Infrastructure Design & Automation: Design, deploy, and automate scalable, secure, and cost-effective infrastructure for training and running AI models.
• Platform Reliability & Monitoring: Implement robust monitoring systems to track the performance, health, and efficiency of deployed AI models and workflows.
• Maintain Knowledge Base: Good knowledge of database concepts such as performance tuning, RBAC, and sharding, along with exposure to different types of databases, from relational to object and vector databases, is preferred.
• Collaboration with AI/ML Teams: Work closely with data scientists, machine learning engineers, and product teams to understand and support their platform requirements.
• Security & Compliance: Ensure that platform infrastructure is secure, compliant with organizational policies, and follows best practices for managing sensitive data and AI model deployment.

🎯 Requirements

• 4+ years of experience in platform engineering, DevOps, or systems engineering with a strong focus on machine learning and AI workloads.
• Proven experience working with LLM workflows and GPU-based machine learning infrastructure.
• Hands-on experience in managing distributed computing systems, training large-scale models, and deploying AI systems in cloud environments.
• Strong knowledge of GPU architectures (e.g., NVIDIA A100, V100), multi-GPU systems, and optimization techniques for AI workloads.
• Proficiency in Linux systems and command-line tools.
• Strong scripting skills (Python, Bash, or similar).
• Expertise in containerization and orchestration technologies (e.g., Docker, Kubernetes, Helm).
• Experience with cloud platforms (AWS, GCP, Azure) and infrastructure-as-code tools such as Terraform, Terragrunt, or similar, plus exposure to automating CI/CD pipelines using Jenkins, GitLab, GitHub, etc.
• Familiarity with machine learning frameworks (TensorFlow, PyTorch, etc.) and deep learning model deployment pipelines.
• Exposure to vLLM or the NVIDIA software stack for data and model management is preferred.
• Expertise in performance optimization tools and techniques for GPUs, including memory management, parallel processing, and hardware acceleration.

🏖️ Benefits

• Health insurance
• 401(k) matching
• Flexible working hours
• Paid time off
• Professional development opportunities

Similar Jobs

October 31

Wurl

51 - 200

📱 Media

Senior Platform Engineer designing and operating systems for ad-serving and streaming platforms at Wurl. Focusing on building infrastructure for high-throughput, low-latency workloads.

🇺🇸 United States – Remote

💵 $144k - $216k / year

⏰ Full Time

🟠 Senior

🏗️ Platform Engineer

🦅 H1B Visa Sponsor

October 31

NVIDIA

10,000+ employees

🤖 Artificial Intelligence

🎮 Gaming

Senior Client Platform Engineer working on macOS at NVIDIA automating device configuration and improving platform reliability. Collaborating cross-functionally and leveraging DevOps practices for client platforms.

🇺🇸 United States – Remote

💵 $168k - $264.5k / year

⏰ Full Time

🟠 Senior

🏗️ Platform Engineer

🦅 H1B Visa Sponsor

Chef

Jamf

MacOS

Python

October 31

Power Platform Developer working remotely to enhance business operations with Microsoft tools. Focused on workflow automation and data-driven HR solutions using Power Platform tools.

🇺🇸 United States – Remote

⏰ Full Time

🟡 Mid-level

🟠 Senior

🏗️ Platform Engineer

October 31

TWO95 International, Inc

51 - 200

☁️ SaaS

🔒 Cybersecurity

🤖 Artificial Intelligence

Senior Fullstack Developer responsible for leading development and design of web applications in an agile environment for corporate projects.

🇺🇸 United States – Remote

⏰ Full Time

🟠 Senior

🏗️ Platform Engineer

October 30

HubSpot

1001 - 5000

🤝 B2B

☁️ SaaS

Senior Automation Platform Engineer designing and delivering automations using Workato for HubSpot's employee experience. Join the Intelligent Automation team to connect critical internal systems.

🇺🇸 United States – Remote

💵 $113k - $168k / year

⏰ Full Time

🟠 Senior

🏗️ Platform Engineer

🦅 H1B Visa Sponsor
