
201 - 500 employees
Founded 2010
🔌 API
🤖 Artificial Intelligence
API • Artificial Intelligence • Cloud Solutions
Leega is a leading technology solutions provider in Latin America, specializing in data analytics and cloud solutions. As the first company in the region certified by Google Cloud for Data Analytics, Leega offers a range of services including application development, machine learning, and risk management analytics. The firm partners with major cloud services such as AWS and Microsoft Azure to help businesses enhance their data management and transition effectively to the cloud, ultimately driving digital transformation and innovation.
🔥 1 minute ago
🗣️🇧🇷🇵🇹 Portuguese Required

• You will architect and evolve the data lake that serves as the company's data nervous system: the foundation that feeds, in real time, the dynamic pricing engine, ML models, and the group's business intelligence.
• This is an ownership role: you define the multi-tenant Lakehouse architecture, from streaming to the semantic layer, and are responsible for its reliability, governance, and cost.
• Design and evolve the data lake on Apache Iceberg over S3, with well-defined layers, partitioning and compaction, time travel, and support for DELETE/UPDATE to comply with LGPD (the Brazilian data protection law).
• Build real-time ingestion (Kafka, Flink, CDC with Debezium) with controlled schema evolution (Schema Registry) and delivery guarantees.
• Model the transformation layer in dbt and orchestrate batch and quality flows in Airflow, from crawler to backfill.
• Maintain metric definitions in Cube.js, the single source that feeds BI and AI agents and ensures consistency across the company.
• Operate federated, low-latency OLAP queries over the lake, keeping them performant while isolating cost and access by tenant.
• Ensure data testing, lineage, and cost efficiency, keeping the platform reliable as it scales.
• Strong command of SQL and query optimization in distributed environments (Minimum 5 years).
• Python with solid experience in PySpark or distributed processing.
• Orchestration (Airflow), ELT and dbt applied at scale (Minimum 4 years).
• Streaming (Kafka, Flink) and Lakehouse architectures with Apache Iceberg (Minimum 3 years).
• Strong understanding of data governance, quality, and modeling.
• Comfortable with AI-assisted development (e.g., Claude Code).
• CDC (Debezium) and low-latency OLAP (ClickHouse, Pinot, Trino/Athena).
• Semantic layers (Cube.js, dbt) and Data Mesh architectures.
• Governance and catalog tools (OpenMetadata, Lake Formation).
• Vector databases (Qdrant) and data pipelines for ML.
• Remote work
• Project duration: 6 months, with possibility of extension or conversion to permanent employment.
🔥 13 hours ago
Junior Data Engineering Analyst at Experian supporting AI solution development and automation in various sectors. Collaborating with experienced professionals to build scalable platforms.
🗣️🇧🇷🇵🇹 Portuguese Required
AWS
Cloud
Docker
NoSQL
Pandas
Python
PyTorch
Scikit-Learn
Spark
SQL
Tensorflow
🔥 16 hours ago
201 - 500
🧬 Biotechnology
🔒 Cybersecurity
📡 Telecommunications
Data Engineer supporting customer analytics team with data ingestion and pipeline maintenance. Involves integration of legacy systems and development using Databricks.
🗣️🇧🇷🇵🇹 Portuguese Required
ETL
PySpark
Spark
SQL
🔥 16 hours ago
10,000+ employees
Data Engineer at Reply specializing in modeling and maintaining Palantir data solutions. Collaborating on AI-driven projects and ensuring data governance and quality.
🗣️🇧🇷🇵🇹 Portuguese Required
PySpark
Python
SQL
🔥 21 hours ago
1 - 10
Senior Software Engineer developing data products for Avra’s AI infrastructure in a remote-first environment. Collaborating with cross-functional teams to build and maintain data systems and services.
🗣️🇧🇷🇵🇹 Portuguese Required
AWS
Cloud
Distributed Systems
Google Cloud Platform
Python
Rust
Go
🕒 Yesterday
Data Engineer II at Experian designing and implementing Data Lake architectures. Collaborating on AI and ML solutions for innovative data-driven insights in various industries.
🗣️🇧🇷🇵🇹 Portuguese Required
Airflow
Apache
PySpark
Python
Scala
Spark
SQL
Terraform