
DataCrunch.io is a fresh cloud service provider; our main focus is providing our own infrastructure for machine learning.
11 - 50 employees
💰 Pre Seed Round on 2021-11
October 29

• Ensure the reliability, scalability, and performance of HPC and cloud systems.
• Build and maintain automation, observability, and monitoring frameworks for compute clusters.
• Collaborate with ML, data, and infrastructure teams to deliver high-availability systems.
• Develop and enhance CI/CD pipelines, deployment workflows, and on-call processes.
• Participate in architecture design and long-term infrastructure strategy discussions.
• Help establish local infrastructure and contribute to the setup of our future San Francisco office.
• Play a key role in recruiting and mentoring as our U.S. team grows.
• 7+ years in SRE, DevOps, or Infrastructure Engineering, preferably in HPC or large-scale distributed systems.
• Linux expertise (Ubuntu or Debian preferred).
• Strong experience with scripting and automation (Python, Go, Bash).
• Proven ability with cloud platforms (AWS, GCP, Azure, or modern HPC providers such as CoreWeave, Lambda, Nebius).
• Deep understanding of networking (DNS/TCP) and infrastructure-as-code tools (Terraform, Ansible).
• Experience managing Slurm-based HPC GPU clusters, diagnosing performance issues, and designing efficient HPC jobs.
• Familiarity with ML model training environments.
• Understanding of Kubernetes (nice to have).
• Generous cash + equity compensation
• Various fringe benefits (e.g., healthcare, lunch, wellbeing)
October 29
Senior DevOps Engineer focused on cloud infrastructure for UserTesting, ensuring systems are fast and reliable. Collaborating with engineers to deliver exceptional developer experiences.
🇺🇸 United States – Remote
💰 Grant on 2020-11
⏰ Full Time
🟠 Senior
⛑ DevOps & Site Reliability Engineer (SRE)
October 29
2 - 10
Site Reliability Engineer developing and maintaining critical features for Stone Tech. Responsible for monitoring performance and ensuring reliability across systems.
🇺🇸 United States – Remote
⏰ Full Time
🟡 Mid-level
🟠 Senior
⛑ DevOps & Site Reliability Engineer (SRE)
🦅 H1B Visa Sponsor
🗣️🇧🇷🇵🇹 Portuguese Required
October 29
DevOps Engineer enhancing and maintaining cloud infrastructure at fast-growing startup Lumos. Collaborates with development and operations teams for automation and scalability.
🇺🇸 United States – Remote
💵 $160k - $190k / year
⏰ Full Time
🟡 Mid-level
🟠 Senior
⛑ DevOps & Site Reliability Engineer (SRE)
🦅 H1B Visa Sponsor
October 29
Site Reliability Engineer ensuring high uptime and performance for cloud systems at Hydra Host. Collaborating with teams to integrate monitoring and QA tools for reliability and observability.
🇺🇸 United States – Remote
💵 $140k - $200k / year
💰 $10M Seed Round on 2022-04
⏰ Full Time
🟡 Mid-level
🟠 Senior
⛑ DevOps & Site Reliability Engineer (SRE)
October 28
Senior Site Reliability Engineer ensuring daily operations and incident handling for large scale GPU platforms at NVIDIA. Contributing to feature design and cluster validation for optimal performance and resilience.
🇺🇸 United States – Remote
💵 $168k - $333.5k / year
⏰ Full Time
🟠 Senior
⛑ DevOps & Site Reliability Engineer (SRE)
🦅 H1B Visa Sponsor