February 29
• StarTree is seeking exceptional Site Reliability Engineers (SREs) to manage, tune, and debug large-scale, highly available distributed systems. You will work with a team of passionate and talented engineers on automating, tuning, and troubleshooting Apache Pinot and SQL DBs. We are looking for motivated, hardworking, and focused individuals with a real passion for operational excellence, data systems, and automation.
• Responsibilities:
- Leverage various monitoring and alerting services to solve intricate programming problems at scale
- Manage and tune multiple critical customer-facing Apache Pinot clusters
- Monitor availability, read/write latencies, and other key telemetry to proactively identify SLO misses and help mitigate issues
- Build rapport and work closely with customers to mitigate and resolve incidents
- Execute disaster recovery strategies with minimal downtime
- Collaborate with other engineers to understand and troubleshoot systems, and use the experience gained to influence the roadmaps of other teams
• 5+ years of experience as an engineer (SRE, SDET, or development)
• Experience managing highly available, production-facing distributed systems and in-depth knowledge of Java are a plus
• Experience with cloud platforms such as AWS, GCP, or Azure
• Experience with Kubernetes and container orchestration
• Familiarity with streaming systems such as Kafka, Pulsar, Flume, Flink, Spark, or similar
• Knowledge of standard methodologies related to security, performance, and disaster recovery
• Strong troubleshooting and critical thinking skills
February 10
51 - 200
🇮🇳 India – Remote
💰 $500k Seed Round on 2000-06
⏰ Full Time
🟡 Mid-level
🟠 Senior
🖥 DevOps & Production Engineering
February 4
11 - 50
🇮🇳 India – Remote
💰 $150k Seed Round on 2016-11
⏰ Full Time
🟡 Mid-level
🟠 Senior
🖥 DevOps & Production Engineering