Member of Engineering – Pre-training, Data Engineering

🕒 January 29

poolside

51 - 200 employees

Founded 2023

🤖 Artificial Intelligence

🏢 Enterprise

poolside is a frontier AI lab and enterprise platform that builds and deploys foundation models, multi-agent systems, and developer-facing tools focused on automating complex software work. The company specializes in on-prem and VPC deployments, security-first integrations, governance, and connectors to enterprise data sources so organizations can run agents and models inside their own boundaries. Poolside embeds research and engineering with customers to deliver outcome ownership, risk controls, and measurable business impact while advancing toward AGI by starting in high-consequence software environments.

📋 Description

• Build and maintain high-performance pipelines for trillions of tokens.
• Deliver diverse, high-quality datasets for pre-training foundation models.
• Work closely with other teams, such as Pretraining, Posttraining, Evals, and Product, to ensure alignment on the quality of the models delivered.

🎯 Requirements

• Strong background in building production-grade, distributed data systems for machine learning, with experience in:
  • Orchestration: Slurm, Airflow, or Dagster
  • Observability & Reliability: CI/CD, Grafana, Prometheus, etc.
  • Infra: Git, Docker, k8s, cloud managed services
  • Batched inference (e.g., vLLM)
• Performance obsession, especially with large-scale GPU clusters and distributed pipelines
• Expert-level Python knowledge and ability to write clean and maintainable code
• Strong algorithmic foundations
• Proficiency with libraries like Polars, Dask, or PySpark
• Nice to have:
  • Experience in building trillion-scale SOTA pretraining datasets
  • Experience translating research to production at scale
  • Experience with OCR, web crawling, or evals
  • Prior experience pre-training LLMs

🏖️ Benefits

• Fully remote work & flexible hours
• 37 days/year of vacation & holidays
• Health insurance allowance for you and dependents
• Company-provided equipment
• Wellbeing, always-be-learning, and home office allowances
• Frequent team get-togethers
• Great, diverse & inclusive people-first culture
