
51 - 200 employees
Founded 2023
🤖 Artificial Intelligence
🏢 Enterprise
Artificial Intelligence • Enterprise
poolside is a frontier AI lab and enterprise platform that builds and deploys foundation models, multi-agent systems, and developer-facing tools focused on automating complex software work. The company specializes in on-prem and VPC deployments, security-first integrations, governance, and connectors to enterprise data sources so organizations can run agents and models inside their own boundaries. Poolside embeds research and engineering with customers to deliver outcome ownership, risk controls, and measurable business impact while advancing toward AGI by starting in high-consequence software environments.
🕒 January 29

• You’ll be working on our data team, focused on the quality of the datasets delivered for training our models.
• This is a hands-on role where your #1 mission is to improve the quality of the pretraining datasets by leveraging your previous experience, intuition, and training experiments.
• This role particularly focuses on generating synthetic data at scale and determining the best strategies to leverage such data in training large models.
• You’ll closely collaborate with other teams like Pretraining, Post-training, Evals, and Product to define high-quality data needs that map to missing model capabilities and downstream use cases.
• Staying in sync with the latest research in synthetic data generation and pretraining is key to success in this role.
• You will constantly lead original research initiatives through short, time-bounded experiments while deploying highly technical engineering solutions into production.
• Because the volumes of data to process are massive, you'll have a performant distributed data pipeline and a large GPU cluster at your disposal.
• Your goal is to deliver large, high-quality, and diverse synthetic datasets mixing natural language and code modalities to train best-in-class coding agents.
• Strong machine learning and engineering background
• Experience with Large Language Models (LLMs)
• Understanding of how LLMs learn
• Data ablations and scaling laws
• Post-training techniques
• Training reasoning and agentic models
• Experience implementing cost-efficient, complex pipelines to generate synthetic datasets at scale, optimizing for data quality, correctness, diversity, etc.
• Experience with evals tracking model capabilities (general knowledge, reasoning, math, coding, long-context, etc.)
• Experience in building trillion-scale pretraining datasets, and familiarity with concepts like data curation, deduplication, data mixing, tokenization, curriculum, impact of data repetition, etc.
• Excellent programming skills in Python
• Strong prompt engineering skills
• Experience working with large-scale GPU clusters and distributed data pipelines
• Strong obsession with data quality
• Research experience (nice to have): author of scientific papers on topics such as applied deep learning, LLMs, or source code generation
• Can freely discuss the latest papers and descend to fine details
• Is reasonably opinionated
• Fully remote work & flexible hours
• 37 days/year of vacation & holidays
• Health insurance allowance for you and dependents
• Company-provided equipment
• Wellbeing, always-be-learning, and home office allowances
• Frequent team get-togethers
• Great, diverse & inclusive people-first culture
🕒 January 28
Senior Synon Developer involved in enhancing RxCLAIM/Claim Adjudication systems. Collaborating on changes to claim processing logic and integrations for a tech-forward company.
🇺🇸 United States – Remote
💵 $120k - $140k / year
💰 Post-IPO Debt on 2023-02
⏰ Full Time
🟠 Senior
🖥 Software Engineer
🕒 January 28
Junior/Mid-level CRM Developer responsible for designing and maintaining CRM software solutions. Join a dynamic team to develop applications for Android and iOS platforms.
🕒 January 27
Webflow Developer optimizing marketing website for Harness' AI-powered software delivery platform. Collaborating with teams to ensure seamless and engaging user experiences while maintaining design integrity.
🇺🇸 United States – Remote
💵 $105k - $120k / year
⏰ Full Time
🟡 Mid-level
🟠 Senior
🖥 Software Engineer
🦅 H1B Visa Sponsor
🕒 January 27
501 - 1000 employees
Developing IVR applications for voice contact center systems at Miratech. Collaborating with teams to enhance customer experience through technical improvements.
🕒 January 24
Founding Product Marketer for OpenRouter, focusing on developer messaging and AI content systems. Lead product launches and create engaging content for technical audiences.