Senior DevOps / SRE Engineer

51 - 200 employees

💼 Consulting

🤖 Artificial Intelligence

💳 Fintech

Consulting • Artificial Intelligence • Fintech

MLabs is a consultancy specializing in Haskell, Rust, blockchain, and AI technologies. It helps clients specify, implement, manage, and maintain technical projects across various sectors. MLabs has expertise in functional programming, compilers, AI, DevOps, and full-stack development, and works primarily with the Fintech and Information Technology industries.

🕒 April 1

🇺🇸 United States – Remote

💵 $120k - $150k / year

⏰ Full Time

🟠 Senior

⛑ DevOps & Site Reliability Engineer (SRE)

Ansible

AWS

Docker

Grafana

JavaScript

Kafka

Kubernetes

Node.js

Postgres

Prometheus

Python

Redis

Terraform

📋 Description

• Build and maintain the infrastructure for concurrent AI trading agents, managing complex cron schedules, state files, and trailing stop processes.
• Deploy and manage agent environments, including workspace persistence, isolated session management, and Model Context Protocol (MCP) server connectivity.
• Design and operate pipelines for shipping trading skills and plugins to production without interrupting live trading activity.
• Execute deployment strategies (blue/green, canary) ensuring active financial positions remain protected during every infrastructure change.
• Build comprehensive alerting across the full stack using metrics, logs, and traces to detect agent failures, state file corruption, or infrastructure regressions before financial loss occurs.
• Operate and scale core platform infrastructure, including Kubernetes (EKS) clusters, Redis, Postgres, ClickHouse, and Kafka.
• Maintain blockchain node infrastructure and ensure stable connectivity to exchange APIs and on-chain transaction systems.
• Lead incident response and on-call practices, including debugging, mitigation, and post-mortems to improve long-term platform reliability.

🎯 Requirements

• Extensive experience in DevOps, SRE, or Infrastructure Engineering, preferably within a startup environment where systems were built from the ground up.
• Proven track record of deploying, scaling, and debugging production workloads, specifically within AWS EKS.
• Proficiency with tools such as Terraform, Ansible, or equivalent frameworks.
• Hands-on experience with Docker and Helm for packaging production services.
• Experience operating production-grade data and messaging systems (Redis, Postgres/RDS, ClickHouse, Kafka).
• Strong experience with Prometheus, Grafana, Datadog, Loki, or OpenTelemetry to build proactive operational visibility.
• Ability to debug across multiple languages, including Python, Node.js, and Go.
• Understanding of systems where latency and reliability have direct financial consequences.
• Familiarity with node infrastructure, exchange APIs, wallet operations, and on-chain monitoring.
• Experience managing secrets, access controls, and production hardening for sensitive financial environments.
• Experience defining SLOs and building mature on-call practices.

🏖️ Benefits

• Opportunity to build infrastructure for a new category of software (Autonomous AI Agents).
• High-autonomy environment with a focus on engineering excellence and technical ownership.
• Competitive compensation package commensurate with senior-level experience.
• Remote-first or flexible working arrangements (as specified by the client).
