
Finance • Artificial Intelligence
The Voleon Group is a company focused on the development and application of advanced machine learning technologies for investment management. By leveraging statistical algorithms and data-driven techniques, Voleon aims to improve financial prediction and management practices. Established in 2007 and headquartered near the University of California, Berkeley, the company benefits from a strong academic environment. Voleon's team consists of top talent in statistics, computer science, and related fields, fostering innovation in a collaborative work culture. The leadership consists of highly educated individuals with backgrounds in computer science and statistics, emphasizing scalability and risk management in their investment strategies.
October 1
🏄 California – Remote
💵 $205k - $235k / year
⏰ Full Time
🟠 Senior
⛑ DevOps & Site Reliability Engineer (SRE)

• Help scale research compute cluster to meet growing needs.
• Leverage engineering skills to ensure high degrees of uptime, reliability, and robustness.
• Responsible for keeping research clusters available and performant.
• Provide a world-class HPC platform for researchers focusing on machine learning problems at scale.
• Support both on-prem and cloud infrastructure, ensuring best experiences for technical staff.
• Collaborate with engineering teams to develop monitoring and telemetry improvements.
• Design and oversee operational frameworks to ensure cluster operations meet SLAs.
• 5+ years of experience in SRE or DevOps roles, preferably working as a senior engineer or tech lead.
• Knowledge of HPC/batch compute frameworks (Slurm, Kueue, AWS/GCP Batch) and/or machine learning training systems (Kubeflow, MLflow, Horovod).
• Ability to develop scripts and utilities of moderate complexity in a common scripting language (Python, Ruby, etc.).
• Familiarity with infrastructure-as-code and configuration management tools (Terraform, Ansible).
• Experience with cloud infrastructure (AWS or GCP).
• Familiarity with designing and implementing modern observability stacks (Prometheus, Grafana, Loki, ELK, OpenTelemetry).
• Experience with distributed storage technologies (Lustre, Ceph, S3).
• Embodies a "system engineer" rather than "system administrator" mindset, thinking systematically and leveraging automation.
• Bachelor's degree in computer science or equivalent experience.
• medical, dental and vision coverage
• life and AD&D insurance
• 20 days of paid time off
• 9 sick days
• 401(k) plan with a company match
• “Friends of Voleon” Candidate Referral Program
October 1
Senior DevOps Engineer at Domyn managing cloud and on-prem infrastructure for enterprise AI. Optimize deployments across GCP, Azure, AWS and ensure security, reliability, and high availability.
AWS
Azure
Cloud
Docker
Google Cloud Platform
Java
JavaScript
Kubernetes
Linux
Postgres
Python
Terraform
September 30
Talent-pool for DevOps-specialist roles at Mission Box Solutions. Connecting veteran-owned recruiting agency candidates with hiring companies across DevOps specializations.
September 30
DevOps Engineer building and operating application servers and IaC for Cutsforth's power-generation monitoring systems. Supports customers, deployments, cybersecurity, and LabVIEW-integrated solutions.
🇺🇸 United States – Remote
💵 $103k - $148k / year
⏰ Full Time
🟡 Mid-level
🟠 Senior
⛑ DevOps & Site Reliability Engineer (SRE)
Cloud
Cyber Security
Terraform
September 30
Senior DevOps Architect designing scalable, secure AWS/Kubernetes infrastructure and CI/CD for CrowdStrike's AI-native cybersecurity platform.
🇺🇸 United States – Remote
⏰ Full Time
🟠 Senior
⛑ DevOps & Site Reliability Engineer (SRE)
🦅 H1B Visa Sponsor
AWS
Cloud
Cyber Security
Distributed Systems
Google Cloud Platform
Kubernetes
Terraform
September 28
Site Reliability Engineer building and automating Unqork's enterprise low-code platform. Improve reliability through SLOs, monitoring, and automation.
🇺🇸 United States – Remote
💰 Venture Round on 2021-01
⏰ Full Time
🟡 Mid-level
🟠 Senior
⛑ DevOps & Site Reliability Engineer (SRE)
🦅 H1B Visa Sponsor
Ansible
AWS
Azure
Chef
Cloud
Google Cloud Platform
Grafana
JavaScript
Kubernetes
Linux
MongoDB
MySQL
Node.js
Oracle
Postgres
Puppet
Python
SaltStack
Splunk
Terraform
Go