Senior Platform Engineer, ML Data Systems

5 hours ago

Khan Academy Türkçe

Education

Khan Academy Türkçe is a free educational platform, offering thousands of instructional videos and interactive exercises in Turkish. It serves as both a personal learning resource and a classroom educational tool, covering a wide range of subjects including mathematics, science, computer science, arts, and humanities. The platform caters to all educational levels from elementary to university and emphasizes personalized learning with a vast library that continues to grow as more content is translated into Turkish. Khan Academy Türkçe aims to provide world-class education to anyone, anywhere, without cost, and is utilized by millions of students and teachers across Turkey.

11 - 50 employees

Founded 2012

📋 Description

• Evolve and maintain pipelines for transforming raw trace data into ML-ready datasets.
• Clean, normalize, and enrich data while preserving semantic meaning and consistency.
• Prepare and format datasets for human labeling, and integrate results into ML datasets.
• Develop and maintain scalable ETL pipelines using Airflow, DBT, Go, and Python running on GCP.
• Implement automated tests and validation to detect data drift or labeling inconsistencies.
• Collaborate with AI engineers, platform developers, and product teams to define data strategies in support of continuously improving the quality of Khan’s AI-based tutoring.
• Contribute to shared tools and documentation for dataset management and AI evaluation.
• Inform our data governance strategies for proper data retention, PII controls/scrubbing, and isolation of particularly sensitive data such as offensive test imagery.

🎯 Requirements

• Bachelor’s or Master’s degree in Computer Science, Data Engineering, or a related field.
• 5 years of Software Engineering experience, with 3+ of those years working with large ML datasets, especially those in open-source repositories such as Hugging Face.
• Strong programming skills in Go, Python, SQL, and at least one data pipeline framework (e.g., Airflow, Dagster, Prefect).
• Experience with data versioning tools (e.g., DVC, LakeFS) and cloud storage systems.
• Familiarity with machine learning workflows, from training data preparation to evaluation.
• Familiarity with the architecture and operation of large language models, and a nuanced understanding of their capabilities and limitations.
• Attention to detail and an obsession with data quality and reproducibility.
• Motivated by the Khan Academy mission “to provide a free world-class education for anyone, anywhere.”
• Proven cross-cultural competency skills demonstrating self-awareness, awareness of others, and the ability to adopt inclusive perspectives, attitudes, and behaviors to drive inclusion and belonging throughout the organization.
• Experience with labeling platforms (e.g., Label Studio, Scale AI, Toloka) or human-in-the-loop systems.
• Understanding of ML evaluation techniques, including prompt-based and generative model metrics.
• Exposure to MLOps practices such as model registry, feature store, or continuous evaluation.
• Background in education technology or other human-centered AI applications.

🏖️ Benefits

• Competitive salaries
• Ample paid time off as needed; your well-being is a priority
• 8 pre-scheduled Wellness Days in 2026, each falling on a Monday or a Friday for a 3-day weekend boost
• Remote-first culture that caters to your time zone, with open flexibility as needed
• Generous parental leave
• An exceptional team that trusts you and gives you the freedom to do your best
• The chance to put your talents towards a deeply meaningful mission and the opportunity to work on high-impact products that are already defining the future of education
• Opportunities to connect through affinity, ally, and social groups
• All the other typical benefits as well: 401(k) + 4% matching and comprehensive insurance, including medical, dental, vision, and life
