Senior Platform Data Engineer

🕒 April 16

🔔 Pennsylvania – Remote


⏰ Full-time

🟠 Senior

🚰 Data Engineer

🦅 Sponsors H-1B visa


🗣️🇺🇸🇬🇧 English required


Geisinger

10,000+ employees

Founded in 1915

💊 Pharmaceuticals

🧘 Wellness

Healthcare • Pharmaceuticals • Wellness

Geisinger is a healthcare organization that has served communities in Pennsylvania for more than a century, offering essential medical services and innovative health solutions. The organization emphasizes diversity, career development, and a supportive environment for its employees, while working to improve the health and well-being of its patients through a range of community programs and initiatives. Geisinger is recognized as one of the most innovative health systems and is committed to shaping the future of healthcare with a focus on patient care and inclusivity.

Description

• The Senior Platform Data Engineer owns roadmap, priorities, platform standards, and architecture reviews; provides formal input on performance reviews.
• This position makes clinical data ready for AI at scale: owning the shared data products, retrieval infrastructure, and platform administration that the entire AI portfolio depends on.
• Owns real-time data feeds, reusable clinical data models and feature pipelines, and RAG retrieval infrastructure (ingestion, chunking, embeddings, vector DB, retrieval pipelines).
• Streams data from Epic SDE, ADT feeds, lab results, and other clinical sources into Databricks for downstream model consumption.
• Curates shared clinical feature tables (patient demographics, labs, vitals, diagnoses, utilization history, imaging metadata) in Databricks/Unity Catalog that multiple AI programs consume for model training, validation, and monitoring.
• Designs and operates document ingestion pipelines: normalizing clinical documents, policies, guidelines, and unstructured data sources into formats ready for embedding and retrieval.
• Implements and optimizes chunking strategies tailored to healthcare content (e.g., preserving clinical note structure, section-aware chunking for guidelines and protocols).
• Establishes data quality gates for RAG: automated profiling, completeness checks, and accuracy scoring before content enters the vector store.

🎯 Requirements

• 5+ years in data engineering, with strong experience building both batch and streaming data pipelines
• Expert-level Databricks skills: Delta Live Tables, PySpark, Unity Catalog, Feature Store
• Hands-on experience with real-time data ingestion (Kafka, Spark Structured Streaming, or comparable frameworks)
• Strong SQL and Python (pandas, PySpark) skills for data transformation and feature engineering
• Experience administering Databricks workspaces: cluster policies, compute management, access controls, cost monitoring
• Familiarity with clinical data models and healthcare data sources (EHR extracts, ADT feeds, lab results, claims data) strongly preferred
• Experience with Epic data extraction methods (SDE, FHIR, epic-ws) a significant plus
• Understanding of data governance principles: lineage, quality monitoring, access controls

🏖️ Benefits

• We offer healthcare benefits for full-time and part-time positions from day one, including vision and dental coverage and coverage for domestic partners.
• We encourage an atmosphere of collaboration, cooperation and collegiality.
• We know that a diverse workforce with unique experiences and backgrounds makes our team stronger.

