Staff Software Engineer – Cloud Platform, Kafka

🕒 May 17

🏄 California – Remote

💵 $136k - $265.7k / year

⏰ Full Time

🔴 Lead

🧑‍💻 Full-stack Engineer

🦅 H1B Visa Sponsor

Calix

1001 - 5000 employees

Founded 2000

📡 Telecommunications

☁️ SaaS

🏢 Enterprise

💰 $50M Venture Round on 2009-08

Calix is a comprehensive solutions provider that focuses on enabling broadband service providers (BSPs) to simplify, innovate, and grow their businesses. Through its advanced broadband platform, Calix offers technologies like 10G-PON, 5G, Wi-Fi 7, and more to improve network operations, reduce downtime, and enhance the subscriber experience. The company provides managed services such as SmartLife and SmartHome, which help subscribers operate, secure, and enhance their connected lifestyles. Calix serves diverse provider types including telcos, cable operators, and municipal utilities, helping them deliver critical broadband connectivity and transform community access to digital services. With a focus on cloud technology, analytics, and transformation guidance, Calix empowers service providers to thrive in the digital age.

📋 Description

• Design, provision, and manage Apache Kafka clusters (self-managed on GCP/AWS or via Confluent Platform / MSK).
• Configure and tune brokers, ZooKeeper/KRaft, topics, partitions, replication factors, and retention policies for high throughput and low latency.
• Perform cluster upgrades, rolling restarts, and broker replacements with zero downtime.
• Implement and manage Kafka Connect pipelines for data ingestion and egress across heterogeneous systems.
• Administer Kafka Streams and ksqlDB deployments for real-time stream processing workloads.
• Maintain Schema Registry and enforce schema governance standards across teams.
• Define and track SLIs/SLOs for consumer lag, throughput, end-to-end latency, and broker health.
• Design and implement cloud infrastructure using IaC (Terraform).
• Build automated deployment pipelines for Kafka configuration changes using GitOps workflows (ArgoCD, Flux).
• Create self-service tooling and runbooks to reduce toil for development teams.
• Automate topic provisioning, ACL management, and schema registration via APIs and CLI tooling.
• Integrate tools like GitLab CI/CD or Cloud Build for automated testing and deployment.
• Ensure seamless integration of data pipelines with other GCP services like BigQuery and Cloud Storage.
• Monitor and optimize performance, reliability, and cost of Kafka and streaming pipelines.
• Implement security best practices for GCP resources, including IAM policies, encryption, and network security.
• Ensure observability is an integral part of the infrastructure platforms and provides adequate visibility into their health, utilization, and cost.
• Collaborate extensively with cross-functional teams to understand their requirements, educate them through documentation and training, and improve adoption of the platforms and tools.

🎯 Requirements

• 10+ years of overall experience in DevOps, cloud engineering, or data engineering.
• 5+ years of experience in Kafka at production scale.
• Deep expertise in Kafka internals: replication protocol, log compaction, consumer group coordination, partition leadership, and KRaft mode.
• Proficiency with container orchestration (Kubernetes / Helm) and deploying Kafka via Strimzi, Confluent Operator, or equivalent.
• Strong understanding of networking (VPC, peering, private endpoints, DNS, load balancing) in cloud environments.
• Hands-on experience with Kafka Connect, Schema Registry, and at least one stream processing framework (Kafka Streams, Flink, Spark Structured Streaming).
• Proficiency in Google Cloud Platform (GCP) services, including Dataflow, Pub/Sub, Kafka, Dataproc, BigQuery, and Cloud Storage.
• Expertise in Infrastructure as Code (IaC) tools like Terraform or Cloud Deployment Manager.
• Familiarity with data orchestration tools like Apache Airflow or Cloud Composer.
• Experience with CI/CD tools like Jenkins, GitLab CI/CD, or Cloud Build.
• Knowledge of containerization and orchestration tools like Docker and Kubernetes.
• Strong scripting skills for automation (e.g., Bash, Python).
• Experience with monitoring tools like Cloud Monitoring, Prometheus, and Grafana.
• Familiarity with logging tools like Cloud Logging or ELK Stack.
• Strong problem-solving and analytical skills.
• Excellent communication and collaboration abilities.
• Ability to work in a fast-paced, agile environment.

🏖️ Benefits

• As part of the total compensation package, this role may be eligible for a bonus.
