Senior Data Engineer, Data and Applied AI

51 - 200 employees

Founded 2020

⚕️ Healthcare Insurance

🧘 Wellness

Plume is a virtual gender-affirming care clinic exclusively for trans and gender non-conforming individuals. The company provides comprehensive gender-affirming hormone therapy and wellness services, accessible through a telehealth model that eliminates traditional barriers to care. Plume focuses on providing trans-centered healthcare, support, and community engagement, offering services like transition guidance, anxiety and depression support, acne treatment, smoking cessation, and more. With services available in 47 states, Plume is committed to supporting the trans community in their gender journey with convenient, affirming care.

🕒 April 10

🇺🇸 United States – Remote

💵 $158k - $168k / year

⏰ Full Time

🟠 Senior

🚰 Data Engineer

🦅 H1B Visa Sponsor

Airflow

Amazon Redshift

Apache

BigQuery

Cloud

Pandas

PySpark

Python

SQL

Tableau

📋 Description

• Building and maintaining production-grade data pipelines in cloud data warehouses such as Google BigQuery or equivalent, following architectural standards set by the Director of Data and AI.
• Designing and developing dbt models across bronze, silver, and gold layers, with a focus on quality and governance via automated tests, documentation, and incremental load strategies.
• Creating and optimizing Airflow DAGs for data workflow orchestration, including scheduling, dependency management, error handling, and alerting.
• Implementing dimensional data models and data mart structures, guided by the team's modeling standards, that support clinical BI and ML feature consumption.
• Crafting easy-to-understand visualizations and dashboards in Looker or equivalent BI tools that align with commonly used business analytics standards, in close collaboration with product analytics, finance, operations, growth, and clinical stakeholders.
• Integrating healthcare data from sources such as EHRs, Stripe, third-party APIs, and application database feeds, and normalizing incoming data into the unified data platform.
• Applying HIPAA-compliant data handling practices, including PHI/PII masking, tokenization, audit logging, and role-based access controls across all pipeline and AI system work.
• Architecting and implementing RAG pipelines (including document ingestion, chunking, embedding generation, and retrieval) using frameworks such as LangChain or LangGraph.
• Supporting MLOps workflows, including model training pipeline maintenance, deployment support, performance monitoring, and retraining triggers.
• Reviewing teammates' PRs, providing constructive technical feedback, and upholding the team's engineering standards.
• Collaborating closely with product managers to understand requirements and deliver reliable data and AI products.
• Monitoring and triaging assigned pipeline and data quality failures, escalating architectural issues as appropriate.
• Documenting pipeline designs, data models, and technical decisions in alignment with the team's governance and lineage tracking standards.
• Evaluating new tools and frameworks, providing hands-on prototyping and technical assessments.

🎯 Requirements

• 5+ years of hands-on experience in data engineering, analytics engineering, or a closely related role.
• 2+ years of experience working within the healthcare industry, including working knowledge of healthcare data standards, clinical workflows, regulated data environments, and domain-specific data visualizations.
• Working knowledge of HIPAA, including PHI/PII classification, data masking, audit logging, and access control requirements.
• Proven production experience with at least one major cloud data warehouse (BigQuery, Snowflake, or Redshift), including advanced SQL and query optimization.
• Strong hands-on experience with dbt (Core or Cloud), including incremental models, tests, documentation, and multi-environment workflows.
• Deep experience with Apache Airflow for workflow orchestration, including DAG design, scheduling, monitoring, and failure handling.
• Demonstrated knowledge of dimensional data modeling: star/snowflake schemas, SCD Types 1/2, and fact and dimension table design.
• Hands-on experience delivering dashboards and reports in at least one enterprise BI tool: Looker, Power BI, Tableau, Qlik, etc.
• Proficiency in Python for data pipeline development, API integrations, and automation (Pandas, PySpark, or similar).
• Practical exposure to RAG pipeline development and LLM integration using LangChain, LangGraph, or LlamaIndex.
• Hands-on exposure to MLOps concepts: model deployment, monitoring, and retraining workflows.
• Knowledge of CI/CD tooling for data and AI workloads (GitHub Actions, dbt Cloud CI).
• Strong understanding of data quality and governance principles (lineage, access controls, data contracts, and automated testing), plus experience with data governance tools such as OpenMetadata.
• Excellent written and verbal communication skills, with the ability to collaborate effectively across engineering, analytics, and clinical teams.
• Ability to work independently on assigned workstreams while keeping the Director and team informed of progress, blockers, and risks.

🏖️ Benefits

• Ground-Floor Equity (Series B)
• Free Medical, Dental, and Vision on the first of the month after you start full-time work
• Unlimited PTO
• 11 paid holidays and company shut-down for a week in December
• 401(k)
• Free Plume and BetterHelp Subscriptions

Similar Jobs

Data Engineer

🕒 April 10

Canoe Intelligence

51 - 200

💳 Fintech

☁️ SaaS

🤖 Artificial Intelligence

Data Engineer at Canoe Intelligence responsible for designing scalable data systems for alternative investment data processes. Collaborating with AI/ML Engineers and developing data architectures for new products.

🇺🇸 United States – Remote

💵 $110k - $140k / year

💰 $36M Series C - Canoe on 2024-07

⏰ Full Time

🟡 Mid-level

🟠 Senior

🚰 Data Engineer

Airflow

Amazon Redshift

AWS

Cloud

Kafka

Postgres

SQL

Data Architect – Strong Azure Services, Finance Experience (GL, AR, AP)

🕒 April 10

CloudScouts

11 - 50

🤝 B2B

🏢 Enterprise

💸 Finance

Data Architect specializing in financial systems (GL, AR, AP) and using Azure. Fully remote position requiring 12+ years of relevant experience.

🇺🇸 United States – Remote

⏰ Full Time

🟠 Senior

🔴 Lead

🚰 Data Engineer

Amazon Redshift

Azure

BigQuery

Cloud

ERP

ETL

Informatica

Kafka

Oracle

Python

Spark

SQL

Tableau

Senior Technical Product Manager, AI Data Platform

🕒 April 9

Aledade, Inc.

501 - 1000

⚕️ Healthcare Insurance

🏢 Enterprise

Senior Technical Product Manager defining AI Data Platform roadmap for Aledade. Collaborating with AI engineers and clinical leadership to support value-based care workflows.

🇺🇸 United States – Remote

⏰ Full Time

🟠 Senior

🚰 Data Engineer

AWS

Azure

Cloud

Google Cloud Platform

Data Engineer III

🕒 April 7

Wayvia (formerly PriceSpider)

201 - 500

🛍️ eCommerce

🛒 Retail

🤖 Artificial Intelligence

Data Engineer III at Wayvia leading development and maintenance of data infrastructure and tools. Ensuring data integrity while driving strategic direction of data initiatives.

🇺🇸 United States – Remote

💵 $130k - $160k / year

⏰ Full Time

🟡 Mid-level

🟠 Senior

🚰 Data Engineer

Airflow

Java

Python

SQL

Founding Data Engineer – PG, Opensearch

🕒 April 5

Neuroscale AI

11 - 50

🤖 Artificial Intelligence

Founding Data Engineer responsible for building the data infrastructure at Neuroscale AI. Designing and maintaining scalable data systems to support AI-driven products.

🇺🇸 United States – Remote

💵 $100k - $200k / year

⏰ Full Time

🟡 Mid-level

🟠 Senior

🚰 Data Engineer

AWS

Cloud

Distributed Systems

ElasticSearch

ETL

Postgres

Python

SQL

Terraform

TypeScript