
51 - 200 employees
💰 Corporate Round on 2022-10
Neurons Lab is a globally distributed AI R&D company that helps deep tech innovators accelerate the development and launch of data-driven products. Our team has expertise in fundamental sciences, full-stack AI/ML engineering, and product design. This rare combination, together with access to scarce talent, allows Neurons Lab to build disruptive solutions for clients in the HealthTech and EnergyTech industries. Neurons Lab operates within a proprietary delivery framework tailored to the innovation environment: fierce competition, tight timelines, few or no datasets, and the need to generate novel solutions.
🔥 0 minutes ago

• Reproduce a descriptive-statistics report end-to-end so any figure traces back to raw source, closing the gap the client admitted (numbers they can't currently defend).
• Profile and reconcile differing source schemas across acquired entities: map differing field names, types, encodings and business definitions for the same concept into one conformed model.
• Build dbt staging → intermediate → mart models with tests; codify the harmonized definitions the Data Science Lead specifies.
• Write Great Expectations suites (null / range / uniqueness / referential checks) and wire them into the pipeline so bad data fails loudly rather than silently corrupting analysis.
• Implement entity / identity resolution (deterministic + fuzzy matching) where there is no clean shared key for the same customer or account across sources.
• Implement and verify anonymization / pseudonymization (hashing / tokenization / k-anonymity) and evidence that re-identification risk is controlled for the client's IT / compliance team.
• Optimize Spark / Glue jobs over tens of millions of rows: partitioning, file formats (Parquet), incremental loads, cost control.
• Orchestrate with Airflow / Step Functions; build repeatable, scheduled pipelines rather than one-off scripts.
• Prepare clean, documented, feature-ready datasets for the PD / delinquency models.
• Document runbooks so the offshore team can operate the pipelines and handover takes days, not weeks; help scope onboarding of the remaining (Ireland + additional) sources.
• 4+ years in data engineering, with strong AWS + Spark / SQL at scale
• Demonstrated experience harmonizing / integrating data across multiple source systems
• Experience building validated, reproducible pipelines in a regulated environment (BFSI, healthcare, government): a strong plus
• Comfortable stepping into a messy, partly-built data estate and bringing it up to standard
• Comfortable as the sole or lead data engineer on a small (3–4 person) delivery pod
• Full-time engagement preferable.
🕒 Yesterday
Senior Data Engineer building modern cloud-native data platforms and migrating legacy systems. Collaborates with Machine Learning, Data Science, and Product teams to innovate data infrastructure.
Airflow
Apache
AWS
Azure
Cloud
ETL
Google Cloud Platform
Kafka
Microservices
PySpark
Python
Spark
SQL
Terraform
Unity
Vault
🕒 Yesterday
Senior Data Engineer building advanced, cloud-native data platforms and migrating legacy systems to the cloud. Collaborating with teams on AI architectures and data pipelines.
Airflow
Apache
AWS
Azure
Cloud
ETL
Google Cloud Platform
Kafka
Microservices
PySpark
Python
Spark
SQL
Terraform
Unity
Vault
🕒 2 days ago
Senior Data Engineer designing, implementing, and improving data architectures for clients in banking. Collaborating with teams to optimize data processes and maintain high standards.
Airflow
Amazon Redshift
AWS
Azure
BigQuery
Cloud
Docker
ETL
Google Cloud Platform
Kafka
Kubernetes
NoSQL
Python
Spark
SQL
🕒 3 days ago
Data Engineer responsible for designing data pipelines and streaming systems at InPost. Working with cross-functional teams to create data products that power ML models and analytics.
🗣️🇵🇱 Polish Required
Apache
AWS
Azure
BigQuery
Cassandra
Cloud
Docker
ETL
Google Cloud Platform
Java
Jenkins
Kafka
MongoDB
NoSQL
Postgres
PySpark
Python
Scala
SOAP
Spark
SQL
🕒 6 days ago
501 - 1000 employees
Middle Data Engineer specializing in Azure Databricks to design and develop modern data pipelines for Miratech. Collaborating on data architectures, enabling advanced analytics and business intelligence.
Azure
ETL
PySpark
Spark
SQL
SSIS