Data Engineer

51 - 200 employees

💰 Corporate Round on 2022-10

Neurons Lab is a globally distributed AI R&D company that helps deep tech innovators accelerate the development and launch of data-driven products. Our team has expertise in fundamental sciences, full-stack AI/ML engineering, and product design. This rare combination, together with access to scarce talent, allows Neurons Lab to build disruptive solutions for clients in the HealthTech and EnergyTech industries. Neurons Lab operates within a proprietary delivery framework tailored to the innovation environment: fierce competition, tight timelines, little to no data, and the need to generate novel solutions.

🇵🇱 Poland – Remote

⏰ Full Time

🟡 Mid-level

🟠 Senior

🚰 Data Engineer

Airflow

AWS

Spark

SQL

📋 Description

• Reproduce a descriptive-statistics report end-to-end so any figure traces back to raw source, closing the gap the client admitted (numbers they can't currently defend).
• Profile and reconcile differing source schemas across acquired entities: map differing field names, types, encodings and business definitions for the same concept into one conformed model.
• Build dbt staging → intermediate → mart models with tests; codify the harmonized definitions the Data Science Lead specifies.
• Write Great Expectations suites (null / range / uniqueness / referential checks) and wire them into the pipeline so bad data fails loudly rather than silently corrupting analysis.
• Implement entity / identity resolution (deterministic + fuzzy matching) where there is no clean shared key for the same customer or account across sources.
• Implement and verify anonymization / pseudonymization (hashing / tokenization / k-anonymity) and evidence that re-identification risk is controlled for the client's IT / compliance team.
• Optimize Spark / Glue jobs over tens of millions of rows: partitioning, file formats (Parquet), incremental loads, cost control.
• Orchestrate with Airflow / Step Functions; build repeatable, scheduled pipelines rather than one-off scripts.
• Prepare clean, documented, feature-ready datasets for the PD / delinquency models.
• Document runbooks so the offshore team can operate the pipelines and handover takes days, not weeks; help scope onboarding of the remaining (Ireland + additional) sources.
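
For candidates wondering what "deterministic + fuzzy matching" might look like in practice, here is a minimal sketch using only the Python standard library. It is an illustration, not the client's or Neurons Lab's actual method: the field names (`tax_id`, `name`, `dob`), the `0.85` threshold, and the choice of `difflib.SequenceMatcher` as the similarity measure are all assumptions made for the example.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not block a match."""
    return " ".join(name.lower().split())


def resolve(left: dict, right: dict, threshold: float = 0.85) -> str:
    """Classify two records as 'deterministic', 'fuzzy', or 'no_match'.

    Field names and the threshold are hypothetical, chosen for illustration.
    """
    # Deterministic pass: exact match on a shared identifier when both sides have one.
    if left.get("tax_id") and left.get("tax_id") == right.get("tax_id"):
        return "deterministic"
    # Fuzzy pass: name similarity, gated by an exact date-of-birth match
    # to keep false positives down.
    if left.get("dob") and left.get("dob") == right.get("dob"):
        score = SequenceMatcher(
            None, normalize(left["name"]), normalize(right["name"])
        ).ratio()
        if score >= threshold:
            return "fuzzy"
    return "no_match"


# Hypothetical records from two source systems.
a = {"tax_id": None, "name": "Jon  Smith", "dob": "1980-01-02"}
b = {"tax_id": None, "name": "john smith", "dob": "1980-01-02"}
c = {"tax_id": "PL123", "name": "J. Smith", "dob": "1980-01-02"}
d = {"tax_id": "PL123", "name": "John Smith", "dob": None}

print(resolve(a, b))
print(resolve(c, d))
```

Running the deterministic pass first means a fuzzy score is only computed when no shared key exists. At tens of millions of rows, pairwise comparison like this would need a blocking step (for example, comparing only within the same date of birth or postcode) before it becomes practical.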

🎯 Requirements

• 4+ years in data engineering, with strong AWS + Spark / SQL at scale
• Demonstrated experience harmonizing / integrating data across multiple source systems
• Experience building validated, reproducible pipelines in a regulated environment (BFSI, healthcare, government) is a strong plus
• Comfortable stepping into a messy, partly-built data estate and bringing it up to standard
• Comfortable as the sole or lead data engineer on a small (3–4 person) delivery pod

🏖️ Benefits

• Full-time engagement preferred.
