Senior Infrastructure Engineer

🕒 April 27

Voxel51

11 - 50 employees

Founded 2018

🤖 Artificial Intelligence

Artificial Intelligence • Data Curation & Management • Healthcare

Voxel51 is the most powerful Visual AI and computer vision data platform that maximizes AI performance with better data. It helps enterprises unlock the value of their data by streamlining data curation, management, and annotation workflows, leading to increased accuracy and productivity in machine learning projects. With capabilities like multimodal data support and robust model evaluation, Voxel51 enables users to build cleaner datasets and derive better insights faster, all while ensuring enterprise-grade security and compliance.

📋 Description

• Shape the architecture and evolution of Voxel51’s infrastructure to support deployments ranging from individual researchers to Fortune 500 enterprises
• Design, build, and scale deployment systems across cloud (GCP, AWS, Azure) and on-premises environments, ensuring reliability, security, and repeatability
• Partner with enterprise customers (and our Customer Success Machine Learning Engineers) to deliver and support production-grade deployments in their environments, guiding them through installation, troubleshooting, and scaling
• Lead infrastructure initiatives across engineering teams, enabling peers to develop, test, and ship features faster with robust internal tooling and automation
• Drive best practices in CI/CD, evolving our pipelines (currently GitHub Actions + Google Cloud Build) and introducing new approaches where they add value
• Develop and maintain deployment solutions for Voxel51-hosted environments (GKE) as well as customer on-prem installations (K8s or Docker Compose)
• Champion developer productivity, improving workflows for development and automated cloud deployments
• Troubleshoot and resolve complex infrastructure issues, spanning build failures, runtime failures, and customer deployment challenges
• Anticipate and prevent failures by designing monitoring, alerting, and predictive solutions for both internal and customer environments
• Mentor engineers and set technical direction, ensuring Voxel51’s infrastructure remains ahead of customer needs and industry trends

🎯 Requirements

• Deep experience with containerized environments
• Building, packaging, and debugging container images
• Kubernetes (and Docker Compose) for orchestration
• Building, maintaining, and deploying Helm charts
• Infrastructure as Code expertise (Terraform, Ansible, or equivalent)
• Scripting and automation skills (Bash or similar)
• Python expertise, including build and environment management, packaging/distribution, release management, and dependency debugging
• CI/CD systems experience, ideally GitHub Actions (we use this today)
• Cloud infrastructure knowledge, especially GCP (IAM, VPC, load balancing, ingress/egress routing, proxies, firewall rules)
• Database fundamentals, ideally MongoDB or similar NoSQL systems
• Observability skills, including designing meaningful monitors, logging, tracing, and alerting
• Security best practices, including certificates, service accounts, least privilege, and role assumptions
• Troubleshooting ability across complex, distributed systems (including with customers in the loop)
• Testing mindset: comfortable with designing and applying different types of tests to validate functionality
• Strong communication skills, with the ability to work directly with enterprise customers as well as collaborate across teams in a remote-first environment
• Adaptability and curiosity, with the ability to ramp quickly on unfamiliar concepts and technologies

🏖️ Benefits

• Equity in the form of options
• A variety of benefits
• The opportunity to grow in an exciting and collaborative environment
