
Artificial Intelligence • Healthcare Insurance • Government
LMI is a forward-thinking company that focuses on reimagining the path from insight to outcome through innovative solutions in various sectors, including applied AI and digital health. It provides advanced analytics, engineering support, and performance optimization across defense, health, and civilian markets, with a strong commitment to enhancing mission effectiveness for government clients. With a focus on collaboration and research, LMI aims to drive positive change through its diverse capabilities and partnerships.
6 hours ago
🇺🇸 United States – Remote
💵 $140k - $170k / year
⏰ Full Time
🟡 Mid-level
🟠 Senior
⛑ DevOps & Site Reliability Engineer (SRE)
🦅 H1B Visa Sponsor

• Monitor the health, performance, and availability of H2FMS applications, services, APIs, and data services in Army GovCloud.
• Troubleshoot system issues across application, data, and infrastructure layers.
• Implement reliability patterns such as redundancy, graceful degradation, and failover strategies.
• Support performance optimization activities based on monitoring metrics and trends.
• Manage user access controls, role-based permissions, and environment access configurations.
• Maintain, monitor, and archive system logs, audit logs, and access logs to support RMF and cATO requirements.
• Support ISSO and Cybersecurity teams in log retrieval, incident investigations, and audit preparation.
• Develop and maintain automation scripts to improve environment stability, operational workflows, and deployment reliability.
• Collaborate with DevSecOps engineers to integrate automated runtime checks, monitoring, and health checks within CI/CD pipelines.
• Assist in implementing automated scaling, alerting, and self-healing mechanisms.
• Participate in incident response activities, including detection, diagnosis, escalation, mitigation, and documentation.
• Coordinate with cybersecurity teams during security events or anomalies.
• Conduct root-cause analysis and contribute to long-term corrective actions.
• Maintain environment configuration inventories related to access, logging, monitoring, and deployment parameters.
• Support configuration management, patch activities, and version control for infrastructure and application components.
• Collaborate with the Cloud Architect on environment design updates and capacity planning.
• Document system configurations, access processes, log retention procedures, and environment health dashboards.
• Support the ISSM and ISSO teams in continuous monitoring package updates and RMF documentation.
• Maintain audit-ready artifacts related to reliability operations and environment management.

• Bachelor’s degree in Information Technology, Computer Science, Engineering, Cybersecurity, or a related field.
• 3–6 years of experience in cloud operations, SRE, DevOps, or system administration roles.
• Hands-on experience with cloud monitoring, logging, and performance management tools (AWS CloudWatch, Azure Monitor, ELK/Splunk, Prometheus/Grafana, etc.).
• Experience with automation tools (Python, Bash, Terraform, Ansible, etc.).
• Familiarity with RMF, Zero Trust, and DoW cloud security requirements.
• Understanding of CI/CD pipelines and deployment processes.
• Ability to obtain and maintain a DoD Secret clearance.

• Health insurance
• Work-Life Wellness
• Career Development

6 hours ago
Site Reliability Engineer managing production-critical infrastructure and data pipelines at a leading AI investment management firm. Collaborating with experts to improve system reliability and operational efficiency.
🇺🇸 United States – Remote
💵 $115k - $135k / year
⏰ Full Time
🟢 Junior
🟡 Mid-level
⛑ DevOps & Site Reliability Engineer (SRE)
Linux
Python
SQL
12 hours ago
Senior Site Reliability Engineer at BrightHire responsible for end-to-end reliability of critical systems. Focusing on infrastructure improvements and collaboration with Product and Engineering teams.
🇺🇸 United States – Remote
💰 $20.5M Series B on 2021-10
⏰ Full Time
🟠 Senior
⛑ DevOps & Site Reliability Engineer (SRE)
🦅 H1B Visa Sponsor
ElasticSearch
Grafana
Kubernetes
Prometheus
Python
SQL
14 hours ago
Senior Probabilistic Risk and Reliability Engineer at GE Vernova focusing on developing risk assessment technologies and methodologies for nuclear plants. Collaborating with multidisciplinary teams to enhance safety and operational reliability.
🇺🇸 United States – Remote
💵 $111.2k - $213.2k / year
⏰ Full Time
🟠 Senior
⛑ DevOps & Site Reliability Engineer (SRE)
15 hours ago
Lead Site Reliability Engineer managing GCP infrastructure for Health Catalyst. Collaborate across teams to improve system reliability and automate processes.
🇺🇸 United States – Remote
⏰ Full Time
🟠 Senior
⛑ DevOps & Site Reliability Engineer (SRE)
🦅 H1B Visa Sponsor
Cloud
Google Cloud Platform
Jenkins
Kubernetes
Python
Yesterday
DevOps Engineer with cloud infrastructure responsibilities at SmithRx, a Health-Tech company. Join a mission-driven team dedicated to cost-effective pharmacy solutions with innovative technology.
🇺🇸 United States – Remote
💰 $20M Series B on 2022-03
⏰ Full Time
🟠 Senior
⛑ DevOps & Site Reliability Engineer (SRE)
Amazon Redshift
AWS
BigQuery
Cloud
Groovy
Kubernetes
NoSQL
Perl
Postgres
Python
Redis
Ruby
SQL
Terraform
Go