Senior Observability Analyst – SRE/Monitoring

🕒 May 27

🗣️🇧🇷🇵🇹 Portuguese Required

GrooveTech

51 - 200 employees

Founded 2017

🤝 B2B

🏢 Enterprise

🎯 Recruiter

B2B • Enterprise • Recruitment

GrooveTech is a Brazilian IT services firm that provides managed staff augmentation and squads, software quality assurance, 24x7 NOC monitoring, and strategic IT consulting, including digital due diligence for M&A. It supplies managed teams (IT executives, PM/PO, technical leads, and business partners) with daily reporting, and focuses on ROI by reducing turnover and improving project performance through test automation, monitoring (on-prem and cloud), and tailored delivery models.

📋 Description

• Serve as the technical observability lead for high-criticality environments.
• Manage and evolve solutions such as Datadog, Zabbix, and Grafana.
• Implement and optimize APM practices, UX monitoring, traces, metrics, and logs.
• Use Azure Monitor and Azure Logs for troubleshooting and event correlation.
• Design and implement alerting integrations via PagerDuty.
• Build and maintain playbooks and runbooks for incident response.
• Support root cause analysis and define preventive actions together with infrastructure and application teams.

🎯 Requirements

• Proven experience in Observability, Monitoring, or SRE.
• Advanced expertise with Datadog (APM, UX, Traces, Dashboards, and Alerts).
• Experience with Zabbix (infrastructure) and Grafana (consolidated dashboards).
• Knowledge of Azure Monitor and Azure Logs.
• Experience with ITIL processes (Incident, Problem, and Change Management).
• Demonstrated ability to create technical documentation (playbooks/runbooks).
• Plus: Knowledge of Azure cloud architecture.
• Experience with ITSM tools.
• Hands-on knowledge of SRE methodologies.
• Experience with continuous improvement processes (PDCA).

🏖️ Benefits

• Wellhub (Gympass)
• Life insurance
• Close support from the staff team and technical mentoring
• Collaborative environment focused on continuous improvement
