Core & ML Ops Team Lead

3 days ago

Zyte

API • Artificial Intelligence • eCommerce

Zyte specializes in web data extraction solutions, now powered by AI. Its products include the Zyte API, which handles ban management, AI scraping, and proxy solutions for web data collection, and scalable cloud hosting for Scrapy spiders, which lets teams manage and automate web scraping tasks. Zyte serves a range of industries, providing data for e-commerce, news, AI, job postings, real estate, and more, with an emphasis on legal compliance. Its focus on streamlined, cost-effective data extraction helps businesses accelerate their data projects and make data-driven decisions.

201 - 500 employees

Founded 2015

💰 $3M Debt Financing on 2021-12

📋 Description

• Design and evolve the core platform (Kubernetes, Mesos, GPU scheduling/autoscaling, distributed compute).
• Own the **model platform**: registry, experiment tracking, training orchestration, evaluation, serving, and monitoring.
• Build the **Golden Path**: reference repos, a scaffold CLI, opinionated CI/CD pipelines, runtime contracts (health/metrics/tracing/SLOs), high-performance clients, circuit breakers and other production‑ready defaults.
• Operate a secure, multi‑tenant **model registry** and training platform with standardized experiment/evaluation harnesses.
• Provide turnkey **serving** patterns (online + batch), drift/quality monitoring, and rollback playbooks.
• Integrate public/open‑source AI capabilities as managed platform services with cost and data‑governance guardrails.
• Run the squad: roadmap/prioritization, delivery, mentoring, and high engineering standards.
• Partner with product engineering (Zyte API, Scrapy Cloud), Prod Ops, and Security on adoption and rollout plans.
• Mentor the team and foster a platform-thinking mindset.

Ownership Areas:

• Container orchestration (Kubernetes/Knative), GPU provisioning & autoscaling, environment & secret management.
• **Operators, sidecars, and internal SDKs/libraries** (Go/Rust/Python/Java) that enforce the golden path contract.
• Model platform: registry, experiment tracking, training orchestration, evaluation framework, serving infra, model monitoring.
• Observability: logging/metrics/tracing pipelines.
• Billing pipeline: metering/events/cost tracking abstractions.
• **Golden Path**: Java, Python, ML templates + CI/CD blueprints + docs + scaffold CLI.
• Reliability enablement (SRE practices), cost governance, supply‑chain security (SBOM, image signing).

🎯 Requirements

• 5+ years experience building distributed systems; 3+ years in MLOps/ML platform engineering (or equivalent impact).
• Knowledge of Linux/OS internals (process model, cgroups/namespaces), networking (TCP/IP, HTTP/2), concurrency, and performance profiling.
• Deep understanding of Kubernetes (bonus: Mesos).
• Proficiency developing high-performance services in Java, Rust, Go or C++ (bonus: familiarity with vert.x and Netty frameworks); strong Python skills.
• Experience with GPU infrastructure (scheduling, containerization, optimization).
• Track record of designing and operating model platforms (registry, training, serving, monitoring) in production.
• Demonstrated success leading technical teams and implementing organization-wide platform solutions.
• Streaming & workflows: Kafka plus Argo/Temporal/Airflow or equivalents.
• eBPF‑based observability, perf tooling, or io_uring experience.
• Cost optimization for ML/AI; multi‑tenant quotas and fairness.
• Hands‑on experience authoring **Golden Paths** (service chassis/templates, CI/CD blueprints, CLI scaffolds).
• SRE practices (SLIs/SLOs, incident management).

🏖️ Benefits

• We love fostering and nourishing new ideas and bringing them to market.
• Become part of a self-motivated, progressive, multi-cultural team.
• Have the freedom and flexibility to work from where you do your best work, as we are a completely remote company.
• Get the chance to work with cutting-edge open-source technologies and tools.
