
51 - 200 employees
Founded 2015
🤖 Artificial Intelligence
☁️ SaaS
🔌 API
💰 $47M Series B on 2022-11
Deepgram is a leading voice AI company that provides powerful APIs for speech-to-text, text-to-speech, and language understanding applications. Their platform enables developers to build sophisticated voice AI solutions for use cases such as contact centers, medical transcription, conversational AI, and more. Known for unmatched accuracy, speed, and cost-effectiveness, Deepgram's technology is trusted by top enterprises and startups worldwide. By offering real-time and highly accurate transcription capabilities, Deepgram helps businesses gain insights from voice data, making it an essential tool for transforming voice interactions.
🕒 March 10
🇺🇸 United States – Remote
💵 $150k - $220k / year
⏰ Full Time
🟡 Mid-level
🟠 Senior
⛑ DevOps & Site Reliability Engineer (SRE)
🦅 H1B Visa Sponsor

• Architect and maintain our core computing platform using Kubernetes on AWS and on-premise, providing a stable, scalable environment for all applications and services.
• Develop and manage our entire infrastructure using Infrastructure-as-Code (IaC) principles with Terraform, ensuring our environments are reproducible, versioned, and automated.
• Design, build, and optimize our AI/ML job scheduling and orchestration systems, integrating Slurm with our Kubernetes clusters to efficiently manage GPU resources.
• Provision, manage, and maintain our on-premise bare metal server infrastructure for high-performance GPU computing.
• Implement and manage the platform's networking (CNI, service mesh) and storage (CSI, S3) solutions to support high-throughput, low-latency workloads across hybrid environments.
• Develop a comprehensive observability stack (monitoring, logging, tracing) to ensure platform health, and create automation for operational tasks, incident response, and performance tuning.
• Collaborate with AI researchers and ML engineers to understand their infrastructure needs and build the tools and workflows that accelerate their development cycle.
• Automate the life cycle of single-tenant, managed deployments.

• 5+ years of experience in Platform Engineering, DevOps, or Site Reliability Engineering (SRE)
• Proven, hands-on experience building and managing production infrastructure with Terraform
• Expert-level knowledge of Kubernetes architecture and operations in a large-scale environment
• Experience with high-performance compute (HPC) job schedulers, specifically Slurm, for managing GPU-intensive AI workloads
• Experience managing bare metal infrastructure, including server provisioning (e.g., PXE boot, MAAS), configuration, and lifecycle management
• Strong scripting and automation skills (e.g., Python, Go, Bash)

• Medical, dental, vision benefits
• Annual wellness stipend
• Mental health support
• Life, STD, LTD Income Insurance Plans
• Unlimited PTO
• Generous paid parental leave
• Flexible schedule
• 12 Paid US company holidays
• Quarterly personal productivity stipend
• One-time stipend for home office upgrades
• 401(k) plan with company match
• Tax Savings Programs
• Learning / Education stipend
• Participation in talks and conferences
• Employee Resource Groups
• AI enablement workshops / sessions
🕒 March 9
DevOps Engineer optimizing Windows-based web services in AWS for healthcare organization. Collaborating on file processing and ensuring compliance with healthcare regulations.
🕒 March 7
Expert DevOps / DevSecOps supporting Generative AI initiatives at Inetum for digital transformation in the United States. Designing high-value GenAI use cases and integrating new tools and practices.
🇺🇸 United States – Remote
💰 Post-IPO Equity on 2007-03
⏰ Full Time
🟠 Senior
🔴 Lead
⛑ DevOps & Site Reliability Engineer (SRE)
🗣️🇫🇷 French Required
🕒 March 7
Manager II of Site Reliability Engineering at Flywire driving reliability, automation, and performance in cloud infrastructure. Collaborating with Engineering teams to achieve production excellence in a global environment.
🇺🇸 United States – Remote
💵 $160k - $200k / year
💰 $60M Series F on 2021-03
⏰ Full Time
🟡 Mid-level
🟠 Senior
⛑ DevOps & Site Reliability Engineer (SRE)
🦅 H1B Visa Sponsor
🕒 March 5
DevSecOps & Cloud Operations Engineer at North Stone supporting cloud automation, monitoring, and security. Managing CI/CD pipelines and optimizing system performance across cloud platforms.
🕒 March 5
Senior II DevOps Engineer developing and maintaining cloud infrastructures and applications for FedRAMP compliance. Collaborating with teams on network security projects and enhancing product deployment.
🇺🇸 United States – Remote
💵 $112.5k - $202.5k / year
💰 Post-IPO Equity on 2001-07
⏰ Full Time
🟠 Senior
⛑ DevOps & Site Reliability Engineer (SRE)
🦅 H1B Visa Sponsor