
Cybersecurity • SaaS • Artificial Intelligence
CrowdStrike is a cybersecurity company that provides cloud-based security services to stop breaches. It is recognized as a leader in endpoint protection, identity and cloud security, and managed detection and response. CrowdStrike's platform, Falcon, integrates artificial intelligence to offer real-time visibility, detection, and protection against sophisticated cyber threats. The company is lauded for its effectiveness in securing networks and data, making it a trusted partner for businesses worldwide.
5001 - 10000 employees
Founded 2011
🔒 Cybersecurity
☁️ SaaS
🤖 Artificial Intelligence
Yesterday
Ansible
AWS
Chef
Cloud
Cyber Security
Distributed Systems
Docker
Google Cloud Platform
Grafana
Kubernetes
MySQL
Prometheus
Puppet
Redis
Terraform
Go

• Ensure Platform Reliability: Own the availability, latency, performance, and efficiency of NG-SIEM platform services handling >100 PB/day of data ingestion and millions of queries per hour
• Build Automation & Tooling: Design and implement automation solutions for deployment, monitoring, incident response, and capacity planning to reduce toil and improve operational efficiency
• Monitor & Optimize: Develop comprehensive observability solutions using metrics, logs, and traces; proactively identify and resolve performance bottlenecks and reliability issues
• Incident Management: Lead incident response efforts, conduct blameless post-mortems, and drive continuous improvement initiatives to prevent recurrence
• Capacity Planning: Analyze system performance data and growth trends to forecast infrastructure needs and ensure the platform scales efficiently with customer demand
• SLO/SLA Management: Define, measure, and maintain Service Level Objectives and error budgets; balance feature velocity with reliability requirements
• Cost Optimization: Implement strategies to optimize cloud resource utilization and reduce operational costs while maintaining performance and reliability standards
• Collaborate Cross-Functionally: Partner with engineering teams to improve system design for reliability, influence architectural decisions, and embed SRE best practices
• On-Call Participation: Participate in on-call rotation to provide 24/7 support for critical production systems
• Documentation: Create and maintain runbooks, operational procedures, and technical documentation to enable team scalability
• Experience in Site Reliability Engineering, DevOps, or similar roles supporting large-scale distributed systems in production environments
• Strong programming skills in at least one language (Go) for automation and tooling development
• Deep cloud expertise with hands-on experience in at least one major cloud platform (AWS or GCP), including compute, storage, networking, and managed services
• Distributed systems knowledge: Understanding of distributed system design patterns, consistency models, fault tolerance, and scalability principles
• Infrastructure as Code: Proficiency with IaC tools (Terraform) and configuration management (Ansible, Chef, Puppet)
• Container orchestration: Experience with Kubernetes, Docker, Podman and container-based deployment patterns
• Observability expertise: Hands-on experience with monitoring and observability tools (Prometheus, Grafana)
• CI/CD pipelines: Experience building and maintaining continuous integration and deployment pipelines
• Incident management: Proven track record of managing high-severity incidents and implementing preventive measures
• Data-driven approach: Ability to analyze system metrics and logs to identify trends, anomalies, and optimization opportunities
• Communication skills: Excellent verbal and written communication abilities for remote collaboration across global teams
Bonus Points:
• Massive scale experience: 3+ years owning systems handling over 1 trillion requests per day or more than 10 PB of data per day
• Multi-cloud experience: Hands-on work with hybrid or multi-cloud environments
• Database expertise: Deep knowledge of distributed databases, data lakes, or SIEM platforms (ClickHouse, Redis, MySQL)
• Security background: Exposure to cybersecurity, threat intelligence, or security operations
• Networking expertise: Advanced understanding of network protocols, load balancing, and CDN technologies
• Remote-friendly and flexible work culture
• Market leader in compensation and equity awards
• Comprehensive physical and mental wellness programs
• Competitive vacation and holidays for recharge
• Paid parental and adoption leave
• Professional development opportunities for all employees regardless of level or role
• Employee Networks, geographic neighborhood groups, and volunteer opportunities to build connections
• Vibrant office culture with world-class amenities
• Great Place to Work Certified™ across the globe