Mid-Level SRE (SRE Pleno) – Afternoon/Night Shift

🕒 May 19

🗣️🇧🇷🇵🇹 Portuguese Required

Banco ABC Brasil

1001 - 5000 employees

Founded 1989

🏦 Banking

💸 Finance

💳 Fintech

Banco ABC Brasil is a financial institution specialized in providing customized financial solutions to individuals and businesses. With a highly skilled multidisciplinary team, it focuses on promoting growth through long-term relationships based on loyalty, transparency, and results. They offer services such as personal investment banking, corporate financial management, insurance brokerage, and energy market solutions, aiming to help clients maximize their financial outcomes.

📋 Description

• Act as the first-response point (N1/N2) for incident handling in cloud (AWS, Azure, GCP) and on-premises environments, performing triage, severity classification and formal logging following ITIL.
• Perform initial incident diagnosis, investigating root causes using logs, metrics and observability events (Zabbix, Grafana, CloudWatch and Dynatrace).
• Escalate correctly to N2/N3 when the incident exceeds the level's scope, ensuring accurate handover of information and context.
• Document all incidents accurately: symptoms, actions taken, resolution, recovery time and lessons learned, contributing to the team's knowledge base.
• Participate in on-call rotations, ensuring coverage and response times within established SLAs.
• Continuously monitor infrastructure dashboards and alerts, acting proactively before degradations become critical incidents.
• Investigate capacity, performance, availability and storage alerts in cloud (AWS, Azure, GCP) and on-premises environments, taking corrective actions or escalating with full context.
• Fulfill infrastructure requests (provisioning, resource adjustments, access creation, configurations) within established deadlines and standards.
• Execute routine operational tasks: patches, backups, capacity checks, cleanup of obsolete resources and inventory updates.
• Plan, document and execute Change Management (GMUD) activities in production environments, following the ITIL Change Management process.

🎯 Requirements

• Proven experience operating AWS cloud production environments, capable of diagnosing and resolving incidents without constant supervision.
• Strong knowledge of Linux and Windows Server: administration, logs, service troubleshooting and connectivity.
• Experience with observability tools (Zabbix, Grafana or CloudWatch) for alert investigation and event correlation.
• Experience applying ITIL: incident opening, classification and resolution; executing GMUDs with rollback plans.
• Active Directory: user and group creation, GPOs, authentication troubleshooting.
• Basic networking (TCP/IP, DNS, DHCP, VPN, firewalls, VLANs), sufficient to diagnose connectivity issues.
• Operational-level Bash or PowerShell for automating routine tasks.
• Degree in Computer Science, Network Engineering, Information Systems, Systems Analysis and Development or related fields.
• Ongoing degree study will be considered if the candidate fully meets practical experience requirements and holds at least one technical certification.

🏖️ Benefits

• Medical Insurance
• Dental Insurance (Omint)
• Life Insurance
• Profit Sharing (PLR)
• Performance Bonus (PPR)
• "ABC with You": a program supporting employees and their families with legal, social, psychological and financial assistance
• Meal Voucher
• Food Voucher
• Extended Parental Leave: 20 days paternity and 6 months maternity
• Childcare/Babysitter Allowance
• Annual Day Off
• Home Office Infrastructure Allowance
• TotalPass
