
Artificial Intelligence • B2B • SaaS
BentoML is a flexible platform for deploying and managing AI/ML models and custom inference pipelines in production. It offers a unified interface for deploying, scaling, and optimizing a wide range of models, including large language models (LLMs). The platform gives users full control over their AI models by supporting deployment in any environment, cloud or on-premises, while ensuring security and compliance: data never leaves the user's infrastructure.
August 8
As a Forward Deployed Engineer at BentoML, design and launch production-ready AI solutions. Engage directly with customers to solve real-world problems.
Open Source • Python
August 8
As an Inference Optimization Engineer at BentoML, improve LLM inference efficiency at scale while reducing GPU costs.
Kubernetes • Node.js • Open Source
August 8
Join BentoML as a Senior Site Reliability Engineer to architect and operate Kubernetes clusters for AI services worldwide and drive infrastructure decisions.
AWS • Azure • Cloud • Flux • Grafana • Kubernetes • Linux • Open Source • Oracle • Prometheus • Terraform