Senior Engineer – Site Reliability

October 7

CrowdStrike

Cybersecurity • SaaS • Artificial Intelligence

CrowdStrike is a cybersecurity company that provides cloud-based security services to stop breaches. It is recognized as a leader in endpoint protection, identity and cloud security, and managed detection and response. CrowdStrike's platform, Falcon, integrates artificial intelligence to offer real-time visibility, detection, and protection against sophisticated cyber threats. The company is lauded for its effectiveness in securing networks and data, making it a trusted partner for businesses worldwide.

5001 - 10000 employees

Founded 2011

📋 Description

• Expertise with Linux engineering and administration for thousands of bare metal servers and virtual machines
• Responsible for troubleshooting server hardware issues
• Responsible for all operational aspects of our platform: availability, latency, throughput, monitoring, issue response (analysis, remediation, deployment), and capacity planning with respect to latency and throughput
• Work in a team of highly motivated engineers distributed across the globe
• Use your passion for technology, automation, and tooling to ensure our platform operates 24x7
• Obsess about learning, and champion the newest technologies and tricks with others, raising the technical IQ of the team
• Have broad exposure to our entire architecture and become one of our experts in our overall process flow
• Have an intrinsic drive to make things better
• Have experience with modern monitoring and telemetry stacks (ELK, Prometheus, Grafana)
• Gather and analyze metrics from both operating systems and applications to assist in performance tuning
• Ability to lead incident analysis, champion incident response practices, assist in correlating incidents to systemic problems, and drive towards resolution

🎯 Requirements

• Bachelor's degree and/or equivalent experience in Computer Science
• A minimum of 7 years of experience in software engineering
• A minimum of 7 years of experience in one or more of: C++, Java, Python, Go
• Experience with storage technologies (examples: SAN, NAS, NFS, object storage, FreeNAS, iSCSI)
• Experience with infrastructure technologies (examples: Linux, Windows, VMware, Docker, Kubernetes)
• Experience writing technical documentation
• Configuration management experience with one or more tools such as Puppet, Chef, Ansible
• Solid understanding of application design, including operational trade-offs of various designs
• Analytical skills coupled with a strong sense of urgency, ownership, and drive
• Ability to work well in a team-focused environment with other SREs and engineers
• Ability to communicate and present broadly the recommended conventions defined by the reliability team

🏖️ Benefits

• Remote-friendly and flexible work culture
• Market leader in compensation and equity awards
• Comprehensive physical and mental wellness programs
• Competitive vacation and holidays to recharge
• Paid parental and adoption leave
• Professional development opportunities for all employees regardless of level or role
• Employee Networks, geographic neighborhood groups, and volunteer opportunities to build connections
• Vibrant office culture with world-class amenities
• Great Place to Work Certified™ across the globe
