Senior Machine Learning Engineer – Inference Platform

🕒 Yesterday

Wizard

51 - 200 employees

🤖 Artificial Intelligence

🛍️ eCommerce

🛒 Retail

💰 $50M Series A (October 2021)

Wizard AI provides a tailored, AI-powered shopping experience. Its text-based service curates product recommendations from across the internet, using AI models that understand user preferences to predict needs and make personalized suggestions. Over SMS, Wizard AI finds, orders, and tracks products and manages returns. The service is designed to save time and make online shopping more convenient by drawing on product data, customer reviews, and other digital content to make informed recommendations.

📋 Description

• Own and evolve our multi-engine inference platform, supporting a variety of model types and serving requirements.
• Build and improve production ML pipelines, taking models from experimentation to reliable, high-throughput serving.
• Define and implement model versioning, rollout, rollback, and lifecycle management strategies that ensure reproducibility and operational reliability.
• Define and enforce serving-layer SLAs, including latency, availability, GPU utilization, Time-to-First-Token (TTFT), and Inter-Token Latency (ITL).
• Build observability, monitoring, alerting, and operational tooling for production inference systems.
• Apply software engineering best practices, including testing, CI/CD integration, and reproducibility across ML workflows.
• Optimize inference performance through efficient resource utilization, hardware-aware serving strategies, and cost-conscious infrastructure design.
• Ensure ML serving systems are secure, scalable, and operationally resilient.
• Partner with ML, Data, Product, and DevOps teams to turn ideas into production systems, driving the technical decisions on serving and scale.

🎯 Requirements

• Bachelor's or Master's degree in Computer Science, Data Science, Engineering, or a related field, or equivalent practical experience.
• 5–8+ years of experience in Software Engineering, ML Engineering, Platform Engineering, or Infrastructure Engineering, with direct ownership of production ML serving systems.
• Hands-on experience running an LLM serving engine (vLLM, TGI, TensorRT-LLM, or SGLang) in production under real load, not just managed or hosted endpoints.
• Strong Python skills and software engineering fundamentals, combined with deep systems and infrastructure knowledge.
• Experience with cloud platforms such as AWS, GCP, or Azure, and familiarity with ML lifecycle tooling, experimentation platforms, and model registries.
• Strong grasp of inference performance (continuous batching, KV-cache and GPU-memory behavior, quantization, and CPU-versus-GPU bottlenecks), with the instinct to profile before tuning.
• Experience serving heterogeneous workloads, including LLMs, embedding models, and extraction models, each with distinct latency, throughput, and scaling requirements.
• Demonstrated ability to balance latency, throughput, reliability, and infrastructure cost while operating production-scale ML systems.
• Experience in high-growth startup environments and comfort operating in fast-moving, evolving technical landscapes.

🏖️ Benefits

• Health insurance
• Flexible work arrangements
• Professional development opportunities
