Senior Manager, DevOps

🕒 April 21

TrueML

51 - 200 employees

💳 Fintech

💸 Finance

👥 B2C

TrueML is a leading company in the fintech sector, known for its innovative solutions that prioritize customer experience in the financial services industry. The company, along with its family of companies like TrueAccord, focuses on developing intelligent, digital-first communication platforms and products that revolutionize the consumer experience in financial health management. TrueML leverages the expertise of a dynamic team of data scientists, financial services experts, and customer experience specialists to create technology that addresses roadblocks to consumers' financial well-being, ensuring inclusivity and accessibility in financial systems. Founded in 2013 by Ohad Samet, TrueML continues to disrupt traditional financial services by making them more consumer-friendly and effective.

📋 Description

• Define and execute the long-term strategic vision for Infrastructure as Code (IaC), CI/CD evolution, and cloud-native architecture to support TrueML’s scaling needs.
• Lead the design and implementation of self-service internal platforms to reduce developer cognitive load, enabling feature teams to deploy and manage services with minimal friction at increased velocity.
• Act as the primary stakeholder for cloud spend (AWS); drive cost-optimization initiatives and lead contract negotiations for the DevOps toolstack and third-party vendors.
• Ensure the infrastructure architecture supports strict High Availability (HA) requirements and robust Disaster Recovery (DR) protocols, maintaining system integrity across multiple regions.
• Oversee the implementation and evolution of comprehensive monitoring, logging, and distributed tracing systems, leveraging AIOps to move from reactive to predictive system maintenance.
• Champion security by design by integrating automated vulnerability scanning, secret management, and compliance checks directly into the automated build pipelines.
• Serve as the ultimate escalation point for major production outages, facilitating blameless post-mortem reviews that focus on systemic improvements rather than individual error.
• Maintain deep technical currency in container orchestration (Kubernetes), serverless patterns, and modern automation frameworks to provide meaningful mentorship and architectural guidance to senior engineering staff.

🎯 Requirements

• Bachelor's degree in Computer Science, Engineering, or a related technical field, or equivalent practical experience.
• 10+ years of experience in DevOps, Site Reliability Engineering (SRE), or Software Engineering; 5+ years of experience managing engineers.
• Expert-level mastery with AWS and experience managing multi-region, high-availability deployments.
• Advanced experience with Kubernetes (K8s) and Docker, including cluster management, networking, and scaling in a production environment.
• Proficiency in Terraform to drive consistency and automation across all infrastructure layers. Experience with Atlantis is a plus.
• Deep experience designing and maintaining complex pipelines (GitHub Actions, GitLab CI, or Jenkins) and mastery of scripting languages like Python, Go, or Bash.
• Hands-on experience with modern monitoring, observability, and tracing stacks (Datadog, Observe) and a firm grasp of SRE principles (SLIs/SLOs/Error Budgets).
• Experience acting as an Incident Commander for high-severity outages and fostering a "blameless" post-mortem culture.
• Demonstrated ability to influence executive leadership and collaborate cross-functionally with Product, Engineering, and Security teams.
• Experience integrating AI-assisted productivity tools (Cline, GitHub Copilot) into the engineering workflow to accelerate delivery.

Similar Jobs

🕒 April 21

Sweed POS

11 - 50

🛒 Retail

🛍️ eCommerce

🤝 B2B

DevOps Engineer optimizing infrastructure and implementing automation for Sweed's cannabis retail platform. Collaborate with global teams to enhance development and deployment processes.

🇺🇸 United States – Remote

⏰ Full Time

🟡 Mid-level

🟠 Senior

⛑ DevOps & Site Reliability Engineer (SRE)

🕒 April 21

Cyngn

51 - 200

🚗 Transport

☁️ SaaS

🔧 Hardware

Deployment Engineer optimizing autonomy for Cyngn's autonomous robotic systems deployed across North America. Leading on-site deployments and ensuring customer satisfaction in a diverse team environment.

🇺🇸 United States – Remote

💵 $100k - $125k / year

💰 $20M Post-IPO Equity on 2022-04

⏰ Full Time

🟡 Mid-level

🟠 Senior

⛑ DevOps & Site Reliability Engineer (SRE)

🦅 H1B Visa Sponsor

🕒 April 20

URBN (Urban Outfitters, Anthropologie Group, Free People & Nuuly)

10,000+ employees

👥 B2C

🛒 Retail

👗 Fashion

Senior DevOps Engineer optimizing cloud infrastructure on GCP for Nuuly. Leading CI/CD initiatives and collaborating with developers to enhance system performance.

🇺🇸 United States – Remote

⏰ Full Time

🟠 Senior

⛑ DevOps & Site Reliability Engineer (SRE)

🕒 April 20

GitLab

1001 - 5000

🤖 Artificial Intelligence

🏢 Enterprise

☁️ SaaS

Site Reliability Engineer for GitLab focusing on Environment Automation and managing isolated environments. Collaborating with the team to ensure reliability, scalability, and security of services.

🇺🇸 United States – Remote

💵 $103.6k - $222k / year

💰 Secondary Market on 2020-11

⏰ Full Time

🟡 Mid-level

🟠 Senior

⛑ DevOps & Site Reliability Engineer (SRE)

🕒 April 20

Skydio

501 - 1000

🔧 Hardware

🤖 Artificial Intelligence

🔐 Security

Deployment Engineer managing technical implementation and support for Skydio's cutting-edge cloud connected products. Collaborating across internal teams and directly with customers to ensure success.

🇺🇸 United States – Remote

💵 $115k - $135k / year

💰 $170M Series E - Skydio on 2024-11

⏰ Full Time

🟡 Mid-level

🟠 Senior

⛑ DevOps & Site Reliability Engineer (SRE)

🦅 H1B Visa Sponsor
