Principal DevOps Engineer

November 22

SambaNova Systems

Artificial Intelligence • Hardware • Enterprise

SambaNova Systems is a technology company focused on advancing artificial intelligence and deep learning. It offers an enterprise-grade AI platform purpose-built for generative AI, enabling organizations across various sectors to rapidly deploy state-of-the-art AI capabilities. The platform integrates everything from hardware chips to AI models, providing powerful solutions for high-performance computing and AI workloads. Its technology is used in fields such as science, public sector applications, and the development of sovereign AI solutions. The company's innovations include alternatives to GPUs, high-speed AI inference, and scalable AI hardware systems.

📋 Description

• Take ownership of our existing Bazel ecosystem, including remote build execution (RBE) setup, maintenance, and troubleshooting.
• Ensure the stability, scalability, and performance of our CI/CD pipelines.
• Collaborate with development teams to optimize build and test processes.
• Maintain and improve our CircleCI setup, including workflow optimization and configuration management.
• Manage Python package dependencies and ensure seamless integration with our CI/CD pipelines.
• Work with the development team to implement best practices for package and dependency management.
• Familiarize yourself with our Google Artifact Registry (GAR) and JFrog Artifact Management setup and optimize its usage.
• Collaborate with the engineering team to implement infrastructure changes and improvements.

🎯 Requirements

• 2+ years of experience in DevOps or infrastructure.
• Experience managing dependencies in large-scale projects.
• Experience with Python package management and RPM packages.
• Experience with Google Artifact Registry (GAR) and/or JFrog Artifact Management.
• Experience with Linux/Unix systems and command-line interfaces.
• Strong scripting skills (e.g., Python, Bash).
• Excellent problem-solving skills and attention to detail.
• Ability to work collaboratively with cross-functional teams.
• Experience maintaining and troubleshooting Bazel ecosystems, especially in C++ and Python.
• Familiarity with containerization (e.g., Docker) and orchestration (e.g., Kubernetes).
• Familiarity with AWS and/or Google Cloud.
• Experience with CI/CD tools such as Jenkins or GitLab CI/CD; CircleCI and Jenkins preferred.
• Knowledge of software development best practices and coding standards.

🏖️ Benefits

• 95% premium coverage for employee medical insurance
• 77% premium coverage for dependents
• Health Savings Account (HSA) with employer contribution
• Dental insurance
• Vision insurance
• Short/Long term Disability insurance
• Basic Life insurance
• Voluntary Life insurance
• AD&D insurance plans
• Flexible Spending Account (FSA) options including Health Care, Limited Purpose, and Dependent Care
• Subscription to Headspace
• Gympass+ membership with access to physical gyms
• One Medical membership
• Counseling services with an Employee Assistance Program
• Well-being benefits available to you and your dependents
