Mid-Level SRE – Afternoon/Evening

1001 - 5000 employees

Founded 1989

🏦 Banking

💸 Finance

💳 Fintech

Banco ABC Brasil is a financial institution that specializes in customized financial solutions for individuals and businesses. With a highly skilled multidisciplinary team, it focuses on promoting growth through long-term relationships based on loyalty, transparency, and results. It offers services such as personal investment banking, corporate financial management, insurance brokerage, and energy market solutions, aiming to help clients maximize their financial outcomes.

🕒 May 19

🏢🏡 São Paulo – Hybrid

⏰ Full Time

🟡 Mid-level

🟠 Senior

⛑ DevOps & Site Reliability Engineer (SRE)

🗣️🇧🇷🇵🇹 Portuguese Required

AWS

Azure

Cloud

DNS

Firewalls

Google Cloud Platform

Grafana

Linux

TCP/IP

📋 Description

• Act as the first-response point (N1/N2) for incident handling in cloud (AWS, Azure, GCP) and on-premises environments, performing triage, severity classification and formal logging following ITIL.
• Perform initial incident diagnosis, investigating root causes using logs, metrics and observability events (Zabbix, Grafana, CloudWatch and Dynatrace).
• Escalate correctly to N2/N3 when the incident exceeds the level's scope, ensuring accurate handover of information and context.
• Document all incidents accurately: symptoms, actions taken, resolution, recovery time and lessons learned, contributing to the team's knowledge base.
• Participate in on-call rotations, ensuring coverage and response times within established SLAs.
• Continuously monitor infrastructure dashboards and alerts, acting proactively before degradations become critical incidents.
• Investigate capacity, performance, availability and storage alerts in cloud (AWS, Azure, GCP) and on-premises environments, taking corrective actions or escalating with full context.
• Fulfill infrastructure requests (provisioning, resource adjustments, access creation, configurations) within established deadlines and standards.
• Execute routine operational tasks: patches, backups, capacity checks, cleanup of obsolete resources and inventory updates.
• Plan, document and execute Change Management (GMUD) activities in production environments, following the ITIL Change Management process.

🎯 Requirements

• Proven experience operating AWS production environments, with the ability to diagnose and resolve incidents without constant supervision.
• Strong knowledge of Linux and Windows Server: administration, logs, service troubleshooting and connectivity.
• Experience with observability tools (Zabbix, Grafana or CloudWatch) for alert investigation and event correlation.
• Experience applying ITIL: incident opening, classification and resolution; executing GMUDs with rollback plans.
• Active Directory: user and group creation, GPOs, authentication troubleshooting.
• Basic networking: TCP/IP, DNS, DHCP, VPN, firewalls and VLANs, sufficient to diagnose connectivity issues.
• Operational-level Bash or PowerShell for automating routine tasks.
• Degree in Computer Science, Network Engineering, Information Systems, Systems Analysis and Development or related fields.
• Candidates still completing a degree will be considered if they fully meet the practical experience requirements and hold at least one technical certification.

🏖️ Benefits

• Medical Insurance
• Dental Insurance (Omint)
• Life Insurance
• Profit Sharing (PLR)
• Performance Bonus (PPR)
• "ABC with You": a program supporting employees and their families with legal, social, psychological and financial assistance
• Meal Voucher
• Food Voucher
• Extended Parental Leave: 20 days paternity and 6 months maternity
• Childcare/Babysitter Allowance
• Annual Day Off
• Home Office Infrastructure Allowance
• TotalPass
