Lead Site Reliability Engineer – Data Platforms


October 11


Coupa Software

SaaS • Finance • eCommerce

Coupa Software is a leading provider of business spend management solutions. Their platform focuses on optimizing and transforming direct and indirect spend across procurement, finance, supply chain, and IT. Coupa leverages AI and extensive data insights to drive cost efficiencies, manage supplier relationships, and mitigate risks. With products covering areas such as invoicing, payments, expense management, and supply chain collaboration, Coupa serves a wide range of industries including automotive, healthcare, retail, and more. Their comprehensive community and partner ecosystem enable organizations to unlock hidden savings and improve compliance, promoting growth and resilience in a changing economic climate.

1001 - 5000 employees

Founded 2006


📋 Description

• Manage end-to-end data pipelines (ETL), AWS infrastructure (S3, EMR, Redshift), and IaC tools like Terraform and Chef.
• Support containerized applications (ECS, Docker) and administer Linux-based systems.
• Collaborate with ML teams to manage ML/GenAI infrastructure and deliver AI-driven features.
• Enable monitoring and observability across systems and applications.
• Provide support for data pipelines, including on-call rotation and release planning with Dev teams.
• Participate in design/code reviews, troubleshoot complex issues, and document root cause analyses (RCAs).

🎯 Requirements

• 8+ years of experience in Big Data technologies, data pipelines, and Linux administration; strong scripting skills in Bash or Python.
• 5+ years managing cloud platforms (AWS, Azure), including tools like ECS, EKS, AKS, Terraform, and Helm.
• Hands-on experience with Infrastructure as Code, CI/CD tools (Chef, Ansible, Jenkins), and source control (Git).
• Familiarity with Generative AI tools (SageMaker, Bedrock, Azure ML), vector databases, and a strong interest in AI technologies.
• Solid knowledge of networking (DNS, load balancers), MySQL, Apache Spark, and BI/data lake platforms (e.g., Looker).
• Strong communication skills; self-driven, with global thinking, and capable of independently resolving complex issues and delivering projects.

🏖️ Benefits

• Health insurance
• Equal employment opportunities
• Welcoming and inclusive work environment


Similar Jobs

October 11

ActioNet, Inc.

1001 - 5000

🤖 Artificial Intelligence

🔒 Cybersecurity

Cloud Engineer / DevOps Engineer designing and optimizing Azure-based environments for NOAA’s Marine Operations. Implementing CI/CD pipelines and managing cloud resources with a focus on security and automation.

🇺🇸 United States – Remote

⏰ Full Time

🟡 Mid-level

🟠 Senior

⛑ DevOps & Site Reliability Engineer (SRE)

October 11

ActioNet, Inc.

1001 - 5000

🤖 Artificial Intelligence

🔒 Cybersecurity

Cloud Engineer/DevOps Engineer designing and optimizing Azure-based data environments for NOAA Marine Operations. Responsible for implementing CI/CD pipelines and managing cloud resources.

🇺🇸 United States – Remote

⏰ Full Time

🟡 Mid-level

🟠 Senior

⛑ DevOps & Site Reliability Engineer (SRE)

October 11

D-Wave

51 - 200

🤖 Artificial Intelligence

🔧 Hardware

Senior Site Reliability Engineer maintaining the reliability of D-Wave's SaaS products and production quantum infrastructure. Collaborating with various teams to ensure system performance and scalability.

🇺🇸 United States – Remote

💵 $124.4k - $185.5k / year

⏰ Full Time

🟠 Senior

⛑ DevOps & Site Reliability Engineer (SRE)

October 10

GovCIO

1001 - 5000

🏛️ Government

🏢 Enterprise

🔒 Cybersecurity

Senior Engineer with DevSecOps experience to provide development and leadership for AWS applications. Job involves architecture and support for a government application being migrated to the AWS cloud.

🇺🇸 United States – Remote

💵 $124.5k - $155k / year

⏰ Full Time

🟠 Senior

⛑ DevOps & Site Reliability Engineer (SRE)

October 10

Nexthink

501 - 1000

☁️ SaaS

🏢 Enterprise

Platform Engineer with SRE operations experience at Nexthink, focusing on building and maintaining infrastructure for their SaaS platform. Join a digital employee experience management software leader.

🇺🇸 United States – Remote

💵 $174k - $272k / year

💰 Series D on 2021-02

⏰ Full Time

🟡 Mid-level

🟠 Senior

⛑ DevOps & Site Reliability Engineer (SRE)
