Site Reliability Engineer

March 30

Abarca Health

A different kind of PBM on a mission to make healthcare seamless & personalized for everyone. #PBMAwesome

Pharmacy Benefit Management (PBM) • Health Business Intelligence • Health Information Technology (HIT) • Medicare • Medicaid

501 - 1000 employees

Description

• Our Site Reliability Engineering team leverages software engineering and infrastructure operations to create highly reliable and scalable software systems. The team is responsible for ensuring that Abarca’s infrastructure operates efficiently by assisting with the design, build, and maintenance of software systems that automate and optimize the deployment, monitoring, and performance of Abarca’s systems. By focusing on improving the reliability and availability of software systems through engineering best practices and tools, we manage complex distributed systems to meet our external Service Level Agreements and internal Operating Level Agreements.
• As our Site Reliability Engineer, you will be responsible for collaborating on the design, build, and maintenance of reliable and scalable infrastructure and software systems. This will be accomplished by tracking error budgets against service level agreements in order to meet and maintain compliance. You will also be collaborating with our Infrastructure, Software Engineering and Security teams to identify and implement reliability and performance improvements across our systems.

Requirements

• Bachelor’s or Master’s Degree in Information Technology, Computer Science or a related field. (Equivalent experience may be considered in lieu of a degree.)
• 3+ years of experience as a site reliability engineer or within related areas.
• Experience managing error budgets as well as service level agreements.
• Experience programming with, but not limited to: .NET, C#, JavaScript, PyScript, T-SQL/SQL.
• Experience with containerization technologies (e.g. Docker and Kubernetes).
• Experience with cloud infrastructure platforms (e.g. AWS, Azure, or GCP).
• Experience with monitoring and alerting tools (e.g. Datadog, AppDynamics, Dynatrace, Prometheus, SolarWinds, Grafana, or Nagios).
• Participate in an on-call rotation to provide 24/7 support for critical systems. Availability to work rotating or irregular shifts, including weekends and certain holidays, per business or operational needs.
• Some travel (15-20%) to our Puerto Rico location is required.
• Excellent oral and written communication skills.
• Experience with automation tools (e.g. Ansible, PowerShell scripting).
• SRE Foundation (SREF) certification.

Responsibilities

• Manage error budgets and ensure that service level agreements are met, keeping our stakeholders satisfied and reducing penalties associated with performance issues.
• Monitor systems for potential performance and reliability issues, proactively taking measures to prevent their occurrence and minimize service disruption.
• Promptly troubleshoot and resolve production issues, and identify reliability improvements that mitigate future occurrences.
• Collaborate with Software Development, among other teams, to continuously improve systems and processes in order to increase efficiency, minimize downtime, and optimize overall system reliability.
• Develop and maintain automation tools to improve system observability, reliability, and performance.
• Design and implement disaster recovery plans to ensure business continuity.
