
10,000+ employees
Founded 1915
💊 Pharmaceuticals
🧘 Wellness
Healthcare • Pharmaceuticals • Wellness
Geisinger is a healthcare organization that has been providing accessible medical services for over a century in Pennsylvania. It focuses on meeting the healthcare needs of its communities and is dedicated to innovative patient care. With career opportunities in various fields including nursing, allied health, and administration, Geisinger promotes professional development and a supportive workplace for its employees, emphasizing diversity, equity, and inclusion.
🕒 April 16

• The Senior Platform Data Engineer owns the roadmap, priorities, platform standards, and architecture reviews, and provides formal input on performance reviews.
• This position makes clinical data ready for AI at scale, owning the shared data products, retrieval infrastructure, and platform administration that the entire AI portfolio depends on.
• Owns real-time data feeds, reusable clinical data models and feature pipelines, and RAG retrieval infrastructure (ingestion, chunking, embeddings, vector DB, retrieval pipelines).
• Streams data from Epic SDE, ADT feeds, lab results, and other clinical sources into Databricks for downstream model consumption.
• Curates shared clinical feature tables (patient demographics, labs, vitals, diagnoses, utilization history, imaging metadata) in Databricks/Unity Catalog that multiple AI programs consume for model training, validation, and monitoring.
• Designs and operates document ingestion pipelines that normalize clinical documents, policies, guidelines, and unstructured data sources into formats ready for embedding and retrieval.
• Implements and optimizes chunking strategies tailored to healthcare content (e.g., preserving clinical note structure, section-aware chunking for guidelines and protocols).
• Establishes data quality gates for RAG: automated profiling, completeness checks, and accuracy scoring before content enters the vector store.
• 5+ years in data engineering, with strong experience building both batch and streaming data pipelines
• Expert-level Databricks skills: Delta Live Tables, PySpark, Unity Catalog, Feature Store
• Hands-on experience with real-time data ingestion (Kafka, Spark Structured Streaming, or comparable frameworks)
• Strong SQL and Python (pandas, PySpark) skills for data transformation and feature engineering
• Experience administering Databricks workspaces: cluster policies, compute management, access controls, cost monitoring
• Familiarity with clinical data models and healthcare data sources (EHR extracts, ADT feeds, lab results, claims data) strongly preferred
• Experience with Epic data extraction methods (SDE, FHIR, epic-ws) a significant plus
• Understanding of data governance principles: lineage, quality monitoring, access controls
• We offer healthcare benefits for full-time and part-time positions from day one, including vision, dental, and domestic partner coverage.
• We encourage an atmosphere of collaboration, cooperation, and collegiality.
• We know that a diverse workforce with unique experiences and backgrounds makes our team stronger.
🕒 April 16
HCM Data Migration Middleware Developer supporting large-scale federal HCM modernization program. Designing, building, and maintaining ETL pipelines for transitioning to Oracle Fusion Cloud HCM platform.
🕒 April 16
Data Architect responsible for designing and building data warehouse solutions at CargoSprint. Collaborating with teams to enhance data quality and accessibility while driving modernization efforts.
🗣️🇪🇸 Spanish Required
🕒 April 16
Data Engineer responsible for building foundational infrastructure for AI deployments and product launches. Collaborating with top researchers and engineers in cutting-edge AI systems.
🕒 April 16
Senior Manager of Data Engineering leading a team that manages data at ReUp Education. Empowering adult learners and institutions by developing data solutions and pipelines.
🕒 April 16
Data Engineer designing and optimizing data solutions at Uni Tencys Systems. Collaborating with teams for machine learning and analytics initiatives while ensuring data quality.