Senior Site Reliability Engineer

Artificial Intelligence • Enterprise • SaaS

ScienceLogic is a leading provider of AI-powered IT operations management solutions designed to improve IT efficiency and business visibility. Their AI platform, SL1, offers comprehensive solutions for monitoring, assuring, automating, and providing insights into hybrid IT infrastructure. ScienceLogic aims to empower businesses by enabling autonomic IT and workflow automation, significantly reducing mean time to repair (MTTR) and driving digital transformation. By consolidating IT tools and providing automated root cause analysis, ScienceLogic enhances problem-solving capabilities and ensures real-time management of IT environments at scale. Their platform is trusted by top-tier organizations across multiple industries, including government and public sector, banking, and financial services.

501 - 1000 employees

Founded 2010

🤖 Artificial Intelligence

🏢 Enterprise

☁️ SaaS

💰 $21.2M Venture Round on 2022-10

November 18

⚔️ Virginia – Remote

⏰ Full Time

🟠 Senior

⛑ DevOps & Site Reliability Engineer (SRE)

🦅 H1B Visa Sponsor

AWS

Cloud

Kubernetes

Linux

Perl

Python

Terraform

ScienceLogic

📋 Description

• Lead design reviews and the buildout of secure systems for delivering a new Artificial Intelligence product in SaaS, aiming for 99.99% uptime.
• Design, automate, test, and monitor the use of cloud-native technologies as the foundation for a service platform.
• Spend 75% of your time on forward-looking priorities, designing and building SaaS systems, while continuing to support the operations and maintenance of the current SaaS infrastructure.
• Investigate and resolve customer and operational issues with a mentality of fixing, not just mitigating, issues.
• Identify and automate the measurement of operations SLAs and SLOs.
• Triage incident response, document SOPs and runbooks, and train NOC team members.
• Write automation that can be easily supported and extended by others.
• Collaborate across the organization to design, build, and operationalize SaaS services conforming to security standards such as FedRAMP, SOC2, and ISO.
• Participate in the on-call rotation as assigned.
• Take full responsibility for the availability and performance of the platform.
• Work on special projects as assigned.

🎯 Requirements

• 8-12 years of site reliability engineering, cloud operations, or equivalent experience.
• Proven experience managing complex Kubernetes environments across multiple production systems.
• Experience with cloud automation tools such as CloudFormation, Terraform, and aws-cli/CDK.
• Proficiency in scripting languages such as Python, Bash, and Perl.
• Exposure to Linux administration.
• Proven track record of operating production SaaS environments within security standards such as FedRAMP, SOC2, ISO, and PCI.
• Skilled at problem solving, algorithms, and data structures conforming to modern SaaS security requirements.
• Experience building tools and scripting frameworks from scratch.
• Familiarity with basic networking, security, and cloud engineering concepts.
• Highly collaborative, with effective written and verbal communication skills.
• Ability to work against tight deadlines, occasionally work off-hours, and participate in the weekly on-call schedule.
• Bachelor's or Master's degree in Computer Science, Information Systems, or a similar field.

🏖️ Benefits

• A remote flexible workplace.
• Comprehensive medical, dental and vision plans.
• 401(k) plan with employer match.
• Flexible Paid Time Off (FTO) so that you can take the time that you need to re-energize.
• Volunteer Time Off (VTO) - take two days off per calendar year to volunteer with your preferred charitable organization.
• 5-year Service Milestone Sabbatical.
• Paid parental leave.
• Generous employee referral bonus program.
• Pet insurance.
• HQ Office centrally located in Reston Town Center featuring a well-stocked kitchen with rotating snacks and beverages, and catered lunch on Thursdays.
• Regular virtual company-wide events, including cooking classes, yoga, meditation and more.
• The opportunity to learn and develop from some of the best and brightest minds in the industry!
