Site Reliability Engineer – Level 3

Job not on LinkedIn

🕒 1 month ago

🇺🇸 United States – Remote

⏰ Full-time

🟡 Mid-level

🟠 Senior

⛑ DevOps & SRE Engineer

🦅 H-1B visa sponsor

🗣️🇺🇸🇬🇧 English required

Granicus

501 - 1000 employees

Founded in 1999

🏛️ Government

☁️ SaaS

📋 Compliance

Granicus is a public-sector-focused technology company that offers a Government Experience Cloud and a range of digital services for local, state, federal, education, and special-district agencies. Its products include engagement and communication platforms, operational services and clouds (for permits, licenses, records, and 311 service requests), meeting and agenda management, websites/CMS, compliance tools, and a Government Experience AI agent that provides 24/7 self-service. Granicus helps public-sector organizations modernize service delivery, increase citizen engagement, automate workflows, and improve operational efficiency.

Description

• Provide production support in shifts according to the team's on-call roster
• Work on tickets raised by customers and by internal engineering/implementation teams when not on call for production support
• Monitor and Maintain Systems: Continuously monitor the health and performance of our services, systems, and infrastructure
• Automate Processes: Develop and maintain automation scripts and tools to streamline operations and reduce manual intervention
• Incident Management: Assist in troubleshooting and resolving incidents, performing root cause analysis, and implementing long-term fixes to prevent recurrence
• System Improvements: Participate in designing and implementing system improvements to enhance reliability, scalability, and performance
• Collaboration: Work closely with software engineers to understand application requirements, provide feedback on design and architecture, and support deployment and release processes
• Documentation: Create and maintain documentation for processes, procedures, and troubleshooting guides to ensure knowledge sharing within the team
• Capacity Planning: Assist in capacity planning activities to anticipate future needs and ensure that our infrastructure can handle growth
• Security: Implement and adhere to security best practices to protect our systems and data

🎯 Requirements

• 5+ years of experience in site reliability engineering, system administration, or a similar role
• Good understanding of Linux/Unix systems, networking, and cloud services (AWS, Azure, or Google Cloud)
• Experience with scripting languages such as Python, Bash, or Ruby
• Bachelor's or postgraduate degree in Computer Science, Information Technology, or a related field, or equivalent practical experience
• Familiarity with AI/ML operations, including model lifecycle management, vector databases, and inference performance tuning
• Expertise in Linux/Unix systems, networking, and cloud services (AWS, Azure, or Google Cloud)
• Proficiency in scripting languages (Python, Bash, Ruby) and programming languages (Go, Java, C++)
• Advanced knowledge of monitoring and logging tools (Elastic, Prometheus, Grafana, Splunk), configuration management (Ansible, Chef, Puppet), and CI/CD pipelines
• Strong analytical and problem-solving skills with the ability to diagnose and resolve complex issues efficiently
• Excellent verbal and written communication skills, with the ability to convey complex technical concepts to non-technical stakeholders
• Demonstrated ability to lead and mentor a team, drive projects to completion, and manage cross-functional initiatives
• Relevant certifications such as AWS Certified DevOps Engineer, AWS Certified Machine Learning – Specialty, Google Cloud Professional DevOps Engineer, or similar are a plus

🏖️ Benefits

• Health insurance
• 401(k) matching
• Flexible work hours
• Paid time off
• Remote work options