Senior Platform Engineer – DevOps, Infrastructure and Platform

🔥 0 minutes ago

🗣️🇧🇷🇵🇹 Portuguese Required

OZmap

11 - 50 employees

☁️ SaaS

📡 Telecommunications

🤝 B2B

OZmap is a SaaS platform that provides GIS-based mapping and management for fiber-optic (FTTH) and hybrid network infrastructures, tailored for internet service providers (ISPs). It centralizes georeferenced network documentation, planning, monitoring (including OTDR and switch integration), mobile field apps, and open APIs to streamline provisioning, reduce operational costs, and speed repairs. OZmap also offers data migration, integrations with CRMs/ERPs, dashboards, and tools for commercial viability checks to support network growth, M&A and day-to-day operations.

📋 Description

• Design, operate and evolve AWS (EC2) and on-premises environments with containers (Docker), ensuring availability, security and scalability;
• Operate and administer Linux production environments (systemd, kernel/network tuning, I/O, process troubleshooting);
• Build and evolve CI/CD pipelines from scratch, including quality and security gates;
• Develop end-to-end observability (instrumentation, exporters, PromQL, SLI/SLO, alerts);
• Lead advanced troubleshooting, root cause analysis and blameless post-mortems, driving structural change afterwards rather than just producing a report;
• Implement automation using Infrastructure as Code;
• Analyze and optimize cloud costs: rightsizing, usage analysis and proposing data-driven alternatives;
• Act as a technical reference for developers and engineers, influencing architecture without relying on formal authority.

🎯 Requirements

• Required: production experience operating core AWS primitives (~4+ years): EC2, VPC/networking, IAM and security, covering both production operation and technical decision-making;
• Linux and networking (~4+ years): server administration and production troubleshooting (full disks, OOM killer, network diagnostics; processes, memory and I/O);
• CI/CD built from scratch (~3+ years): pipelines created and evolved by you (GitHub Actions, Jenkins, self-hosted runners, secrets, caching, gates);
• End-to-end open-source observability (~2+ years): Prometheus, Grafana, Loki, VictoriaMetrics or equivalents, configured and operated by you, not just used; OpenTelemetry, including instrumentation, exporters, PromQL and SLI/SLO definition;
• Operation beneath managed layers: concrete experience with nginx/HAProxy/Envoy and the Linux underneath, plus critical incidents whose resolution you led;
• Docker in production (~3+ years): real operation of containers in critical environments (volumes, networking, resource management, graceful shutdown of services);
• High autonomy: takes an ambiguous problem ("our observability is weak") and delivers a solution end-to-end;
• Ownership and proactivity: anticipates problems before they become incidents;
• Clear communication and technical influence, connecting development, infrastructure and business teams;
• Conducts post-mortems focused on root cause, organizational learning and continuous improvement, without a blame culture;
• Maturity to self-manage while working remotely.

🏖️ Benefits

• 💻 Equipment allowance – to ensure a comfortable work setup;
• 💚 Health support – because your well-being matters;
• 📚 Education support – we support your continuous development journey;
• 🎂 Birthday gift – because we like to celebrate together;
• 🏅 Recognition for tenure – your time with us is valued;
• 🗣️ Language support – to help you go beyond borders;
• 🏋️ TotalPass (for employee use only);
• 🌴 Paid leave after 12 months of employment;
• 🎉 Online integration events and socials.
