Data Engineer III

10,000+ employees

Founded in 1991

🔧 Hardware

🛒 Retail

Hardware • Manufacturing • Retail

Dyson is a distinctive technology company known for its innovation and engineering excellence. Originating from a small workshop in rural England, Dyson has grown into a global powerhouse with offices around the world, from Auckland to Zurich and from Shanghai to Chicago. The company's core revolves around engineering but extends to pioneering work across technology sectors, including energy storage, robotics, and machine learning. Dyson's products are renowned for their quality and innovation, with a strong focus on design and development that is reflected in its vibrant global offices.

Job is not listed on LinkedIn

🕒 May 24

🏄 California – Remote

💵 $104,000 - $153,000 / year

⏰ Full Time

🟡 Mid-level

🟠 Senior

🚰 Data Engineer

🦅 Sponsors H1B Visa

🗣️🇺🇸🇬🇧 English required

AWS

Cloud

Docker

ETL

Heroku

Kubernetes

Python

SDLC

Spark

SQL

Terraform

Unity

Dyson

Description

• Lead architecture and design of complex data pipelines on Databricks lakehouse architecture (Unity Catalog, Delta Lake, Structured Streaming)
• Define technical approach for data engineering initiatives, mentor less-senior engineers, and set standards for code quality through leadership and code reviews
• Design and build data foundations that enable AI/ML capabilities: feature stores, embedding pipelines, vector search indexes, and model training datasets
• Align data engineering solutions with business strategy, including support for Agentic AI workloads
• Own health, scalability, and modernization of data infrastructure with Databricks as the strategic platform, including workload migration, compute optimization, and Unity Catalog adoption
• Optimize pipeline performance (Delta Lake table layouts, clustering, Z-ordering) and establish monitoring/alerting best practices with clear SLAs
• Build data infrastructure supporting Agentic AI systems: real-time data access layers, context retrieval pipelines, and agent-accessible data services
• Collaborate cross-functionally with DevOps, Platform Engineering, and MLOps roles to integrate data solutions into the broader technology environment and shared AI infrastructure (MLflow registries, feature stores, and agent orchestration layers)
• Provide consultation to Senior Leadership on complex projects and drive continuous improvement initiatives
• Champion data governance at all layers for data, models, and AI assets
• Implement data quality strategies (master data management, validation rules, Delta Live Tables expectations) to ensure trust in enterprise data
• Serve as liaison across data engineering, AI engineering, and business teams; promote data literacy and stewardship

🎯 Requirements

• Bachelor's in Computer Science, Engineering, or related field (Master's preferred)
• 5+ years with Python and SQL in data engineering for big data ML/analytics workloads
• 5+ years designing, building, and troubleshooting scalable ETL/ELT pipelines for business-critical production systems
• 3+ years with cloud data services (AWS), container orchestration (Docker, Kubernetes), and IaC (Terraform, CloudFormation)
• 3+ years architecting ML workflows and data platforms with CI/CD, automated testing, and distributed processing (Spark)
• 3+ years collaborating cross-functionally with Data Science, MLOps, Platform Engineering, and DevOps teams
• 3+ years implementing data quality testing and optimizing SQL/Python for cost/performance in the cloud
• Understanding of the full Data Science SDLC, and experience mentoring engineers

Strongly Preferred - Databricks & AI Platform

• 2+ years hands-on with Databricks (Delta Lake, Unity Catalog, Databricks SQL)
• Experience with MLflow experiment tracking and model registry workflows
• Experience designing pipelines that serve AI/ML inference: real-time feature engineering, embedding generation, and context retrieval for LLM-based systems
• Understanding of how data engineering supports Agentic AI: agent-accessible data services, low-latency retrieval, and pipelines enabling autonomous multi-step workflows
• Familiarity with Databricks Mosaic AI, Vector Search, and/or Feature Store
• FinOps awareness: compute cluster optimization, cost attribution by workload
• Familiarity with Salesforce/Heroku data infrastructures
• Experience with data virtualization (e.g., Dremio)
• Understanding of Platform Engineering concepts and internal developer platforms
• Experience migrating from legacy data warehouse/lake to unified lakehouse architecture
• Familiarity with Odaseva data security and management

🏖️ Benefits

• Group health insurance benefits (medical, vision, dental)
• FSA and HSA healthcare accounts
• Life and accident insurance
• Adoption and fertility assistance
• Paid parental leave of up to 6 weeks
• Short/long term disability
• Paid time off for vacation, personal needs, and sick time
• Up to 17 days of Choice Time Off (CTO) per calendar year
• Up to 11 paid holidays per calendar year
• Opportunity to contribute to company's 401(k) savings and investment plan or deferred compensation plan with an employer match of 100% on the first 3% of contributions

Similar Jobs

Senior Data Engineer

🕒 May 23

Live Nation Entertainment

10,000+ employees

📱 Media

Senior Data Engineer developing scalable data pipelines and optimizing performance at Live Nation Entertainment. Collaborating with cross-functional teams and leveraging AI technologies for improved productivity.

🇺🇸 United States – Remote (USA)

💵 $152,000 - $190,000 / year

💰 Post-IPO Debt in 2023-01

⏰ Full Time

🟠 Senior

🚰 Data Engineer

🗣️🇺🇸🇬🇧 English required

Cloud

ETL

PySpark

Python

Spark

SQL

Lead Data Engineer

🕒 May 23

Blend360

501 - 1000

🤖 Artificial Intelligence

🏢 Enterprise

Lead Data Engineer supporting large-scale healthcare data platform initiative focused on GCP at BLEND360. Design and optimize data pipelines for enterprise analytics and reporting.

🇺🇸 United States – Remote (USA)

💵 $150,000 - $160,000 / year

💰 $100,000,000 Private Equity Round in 2022-08

⏰ Full Time

🟠 Senior

🚰 Data Engineer

🦅 Sponsors H1B Visa

🗣️🇺🇸🇬🇧 English required

BigQuery

Cloud

ETL

Google Cloud Platform

Python

SQL

Senior Data Engineer

🕒 May 23

Blend360

501 - 1000

🤖 Artificial Intelligence

🏢 Enterprise

Senior Data Engineer developing scalable data solutions for enterprise healthcare analytics initiatives. Supporting large-scale healthcare data platform on Google Cloud Platform (GCP).

🇺🇸 United States – Remote (USA)

💵 $115,000 - $125,000 / year

💰 $100,000,000 Private Equity Round in 2022-08

⏰ Full Time

🟠 Senior

🚰 Data Engineer

🦅 Sponsors H1B Visa

🗣️🇺🇸🇬🇧 English required

BigQuery

Cloud

ETL

Google Cloud Platform

Python

SQL

Senior Healthcare Data Engineer

🕒 May 22

BerryDunn — Assurance, Tax and Consulting

501 - 1000

Senior Healthcare Data Engineer developing end-to-end data management services for healthcare clients. Engaging in planning, analysis, and implementation of data integration and business intelligence solutions.

🇺🇸 United States – Remote (USA)

💵 $85,000 - $125,000 / year

⏰ Full Time

🟠 Senior

🚰 Data Engineer

🗣️🇺🇸🇬🇧 English required

AWS

Azure

ETL

Python

SQL

Tableau

Data Engineer II

🕒 May 22

InStride Health

51 - 200

⚕️ Health Insurance

🧘 Wellness

Data Engineer II designing and building data infrastructure for InStride Health. Handling ETL/ELT pipelines, data warehouse architecture, and collaborating with analytics teams.

🇺🇸 United States – Remote (USA)

💵 $110,000 - $125,000 / year

💰 $26,000,000 Venture Round in 2022-10

⏰ Full Time

🟡 Mid-level

🟠 Senior

🚰 Data Engineer

🗣️🇺🇸🇬🇧 English required

Amazon Redshift

AWS

BigQuery

Cloud

ETL

Matillion

Python

SQL