Staff Software Engineer – Cloud Platform, Kafka

🕒 May 17

Calix

1001 - 5000 employees

Founded 2000

📡 Telecommunications

☁️ SaaS

🏢 Enterprise

💰 $50M Venture Round on 2009-08

Calix is a comprehensive solutions provider that focuses on enabling broadband service providers (BSPs) to simplify, innovate, and grow their businesses. Through its advanced broadband platform, Calix offers technologies like 10G-PON, 5G, Wi-Fi 7, and more to improve network operations, reduce downtime, and enhance the subscriber experience. The company provides managed services such as SmartLife and SmartHome, which help subscribers operate, secure, and enhance their connected lifestyles. Calix serves diverse provider types including telcos, cable operators, and municipal utilities, helping them deliver critical broadband connectivity and transform community access to digital services. With a focus on cloud technology, analytics, and transformation guidance, Calix empowers service providers to thrive in the digital age.

📋 Description

• Design, provision, and manage Apache Kafka clusters (self-managed on GCP/AWS or via Confluent Platform / MSK).
• Configure and tune brokers, ZooKeeper/KRaft, topics, partitions, replication factors, and retention policies for high throughput and low latency.
• Perform cluster upgrades, rolling restarts, and broker replacements with zero downtime.
• Implement and manage Kafka Connect pipelines for data ingestion and egress across heterogeneous systems.
• Administer Kafka Streams and ksqlDB deployments for real-time stream processing workloads.
• Maintain Schema Registry and enforce schema governance standards across teams.
• Define and track SLIs/SLOs for consumer lag, throughput, end-to-end latency, and broker health.
• Design and implement cloud infrastructure as code (IaC) using Terraform.
• Build automated deployment pipelines for Kafka configuration changes using GitOps workflows (ArgoCD, Flux).
• Create self-service tooling and runbooks to reduce toil for development teams.
• Automate topic provisioning, ACL management, and schema registration via APIs and CLI tooling.
• Integrate tools like GitLab CI/CD or Cloud Build for automated testing and deployment.
• Ensure seamless integration of data pipelines with other GCP services such as BigQuery and Cloud Storage.
• Monitor and optimize the performance, reliability, and cost of Kafka and streaming pipelines.
• Implement security best practices for GCP resources, including IAM policies, encryption, and network security.
• Ensure observability is an integral part of the infrastructure platforms and provides adequate visibility into their health, utilization, and cost.
• Collaborate extensively with cross-functional teams to understand their requirements, educate them through documentation and training, and improve adoption of the platforms and tools.

🎯 Requirements

• 10+ years of overall experience in DevOps, cloud engineering, or data engineering.
• 5+ years of experience with Kafka at production scale.
• Deep expertise in Kafka internals: replication protocol, log compaction, consumer group coordination, partition leadership, and KRaft mode.
• Proficiency with container orchestration (Kubernetes / Helm) and deploying Kafka via Strimzi, Confluent Operator, or equivalent.
• Strong understanding of networking (VPC, peering, private endpoints, DNS, load balancing) in cloud environments.
• Hands-on experience with Kafka Connect, Schema Registry, and at least one stream processing framework (Kafka Streams, Flink, Spark Structured Streaming).
• Proficiency in Google Cloud Platform (GCP) services, including Dataflow, Pub/Sub, Kafka, Dataproc, BigQuery, and Cloud Storage.
• Expertise in Infrastructure as Code (IaC) tools like Terraform or Cloud Deployment Manager.
• Familiarity with data orchestration tools like Apache Airflow or Cloud Composer.
• Experience with CI/CD tools like Jenkins, GitLab CI/CD, or Cloud Build.
• Knowledge of containerization and orchestration tools like Docker and Kubernetes.
• Strong scripting skills for automation (e.g., Bash, Python).
• Experience with monitoring tools like Cloud Monitoring, Prometheus, and Grafana.
• Familiarity with logging tools like Cloud Logging or ELK Stack.
• Strong problem-solving and analytical skills.
• Excellent communication and collaboration abilities.
• Ability to work in a fast-paced, agile environment.

🏖️ Benefits

• As a part of the total compensation package, this role may be eligible for a bonus.
