Senior/Staff/Principal Software Engineer – Observability Engineering

🕒 May 11

AppGate

501 - 1000 employees

🔒 Cybersecurity

🏢 Enterprise

AppGate is a global cybersecurity company that delivers high-performance Zero Trust Network Access (ZTNA) solutions for enterprises and government agencies. Its platform enforces identity-based, adaptive access policies using real-time risk scoring, AI-powered application discovery, and a direct-routed architecture designed to avoid cloud bottlenecks and scale with demanding environments. AppGate also provides professional services and cyber advisory offerings — including adversary simulation, penetration testing, and third-party access risk assessments — to help organizations implement and operationalize Zero Trust controls.

📋 Description

• Own the end-to-end design and implementation of the AppGate observability fabric, from telemetry SDKs in our clients and gateways, to the LogForwarder pipeline, to customer-side integrations.
• Make foundational technical decisions (transport protocols, sampling strategies, schema design, correlation models) that determine whether our platform scales gracefully to hundreds of millions of events per day.
• Enable next-generation capabilities, including OpenTelemetry-Native Telemetry Fabric, High-Cardinality Data Pipeline, End-to-End Distributed Tracing, On-Demand Packet Capture, and more.
• Define telemetry schema, correlation model, transport, and sampling strategies spanning client devices, controllers, and gateways.
• Validate at Customer Scale: Test in lab environments matching our largest deployments and hunt down cardinality explosions and pipeline backpressure before customers see them.
• Drive Integration Standards: Own the OTLP, Prometheus, and JSON-log compatibility surface and validate ingestion into Datadog, Splunk, Nexthink, and Elastic.
• Collaborate Cross-Functionally: Work directly with product, R&D, and marquee customers in defense and critical infrastructure to shape requirements and deliver outcomes that matter.

🎯 Requirements

• 8+ years of engineering experience, with at least 4 years dedicated to observability, telemetry, or large-scale data infrastructure (Datadog, Splunk, Elastic, Honeycomb, New Relic, Grafana Labs, or equivalent).
• Deep OpenTelemetry expertise: OTLP, the OTel Collector, semantic conventions, context propagation, and head/tail sampling. You can debate the trade-offs in your sleep.
• Distributed tracing in production: You’ve designed or significantly contributed to a tracing system handling real customer traffic, not just a side project.
• High-throughput pipeline experience: Hands-on with systems ingesting 100M+ events per day, including back-pressure handling, batching, and storage trade-offs.
• Strong systems programming: Production Go and/or Rust preferred. Comfort across the stack, from agent code to backend services.
• Networking and security fluency: Comfortable with TLS, DNS, TCP, and identity protocols. Prior ZTNA, SASE, or SD-WAN experience is a strong plus.
• Mindset: Pragmatic, opinionated, and impact-driven. You know when to prototype and when to ship.
