Field Reliability Engineer

🔥 4 minutes ago

Honeycomb.io

51 - 200 employees

Founded 2016

☁️ SaaS

🏢 Enterprise

🤖 Artificial Intelligence

Honeycomb.io is an observability platform designed to provide comprehensive insights into application performance. It unifies logs, metrics, and traces into a single data type, allowing engineers to quickly diagnose and resolve issues. Honeycomb.io offers features like distributed tracing, anomaly detection, and service maps to help teams enhance system visibility and operational efficiency. It integrates with popular cloud services like Amazon Web Services and Kubernetes, and supports technologies such as OpenTelemetry. Honeycomb.io aims to enable engineering teams to deploy confidently, reduce incident response times, and improve overall productivity.

📋 Description

• Own and operate customer-facing managed infrastructure including Refinery as a Service (RaaS) and Honeycomb Private Cloud (HnyPC) deployments across multiple AWS accounts and regions.
• Build and maintain Terraform modules, Helm charts, and deployment automation for provisioning and managing customer EKS clusters, collector pools, and Refinery instances.
• Design and implement monitoring, alerting, and observability for managed service infrastructure - using Honeycomb to monitor Honeycomb.
• Manage scaling, upgrades, and incident response for customer deployments, including capacity planning and cost optimization across AWS infrastructure.
• Build autonomous deployment and management tooling for field-operated managed services.
• Serve as the senior technical escalation point for our most challenging customer situations - production incidents, complex collector configurations, Refinery tuning, and architecture reviews that exceed the scope of standard technical roles.
• Diagnose and resolve deep infrastructure and observability issues spanning distributed systems, Kubernetes clusters, AWS networking (ALBs, PrivateLink, NLBs, VPCs), and polyglot service meshes.
• Partner directly with customer SRE, platform, and engineering teams to troubleshoot real-time production issues, often under time pressure and with direct revenue impact.
• Participate in an on-call rotation for managed services (Refinery as a Service, Honeycomb Private Cloud), providing Tier 2 escalation support for customer-facing infrastructure issues.
• Build and maintain SOPs, runbooks, and diagnostic frameworks that accelerate resolution for the broader field and support teams.
• Contribute to and maintain OpenTelemetry distributions, collectors, exporters, and instrumentation libraries that our customers depend on.
• Represent Honeycomb in the OpenTelemetry community - participating in SIGs, reviewing PRs, triaging issues, and driving adoption of best practices.
• Build reference architectures, sample collector configurations, and integration guides that demonstrate effective instrumentation patterns for common customer environments (Kubernetes, ECS, serverless).

🎯 Requirements

• Identify gaps in the open source ecosystem that create friction for customers and either contribute fixes upstream or build bridging solutions.
• Contribute features and improvements to Honeycomb’s own open source projects (Refinery, Honeycomb Collector Distro) to support managed service capabilities.
• Be the person Solutions Architects call when a deal goes deeper than demo and design - you join calls to troubleshoot live production environments, validate architecture decisions, and provide the infrastructure credibility that closes technical evaluations.
• Tag-team with SAs on strategic accounts, owning the infrastructure and data pipeline conversations while they own the product narrative.
• Lead architecture reviews, SLO workshops, and instrumentation deep-dives for customers evaluating or expanding Honeycomb - especially in complex environments (multi-cluster Kubernetes, hybrid cloud, high-cardinality workloads).
• Step into customer-facing POCs and pilots as the hands-on technical lead, standing up collector pools, configuring Refinery pipelines, and proving out integrations in the customer’s actual environment.
• Create feedback loops between the field and product/engineering, surfacing patterns from customer environments that inform roadmap priorities.

🏖️ Benefits

• A stake in our success - generous equity with employee-friendly stock program
• It’s not about how strong of a negotiator you are - our pay is based on transparent levels relative to experience
• Time to recharge with unlimited PTO
• A distributed-first mindset and culture (really!)
• Home office, co-working, and internet stipend
• Full benefits coverage for employees, with additional coverage available for dependents
• Up to 16 weeks of paid parental leave, regardless of path to parenthood
• Annual development allowance
• And much more...
