Mid-Level Data Engineer – GCP, DBT

201 - 500 employees

Founded 2010

API • Artificial Intelligence • Cloud Solutions

Leega is a leading technology solutions provider in Latin America, specializing in data analytics and cloud solutions. As the first company in the region certified by Google Cloud for Data Analytics, Leega offers a range of services including application development, machine learning, and risk management analytics. The firm partners with major cloud services such as AWS and Microsoft Azure to help businesses enhance their data management and transition effectively to the cloud, ultimately driving digital transformation and innovation.

🕒 June 24

🇧🇷 Brazil – Remote

⏰ Full Time

🟡 Mid-level

🟠 Senior

🚰 Data Engineer

🗣️🇧🇷🇵🇹 Portuguese Required

BigQuery

Cloud

ETL

Google Cloud Platform

Hadoop

PySpark

Python

Shell Scripting

Spark

SQL

📋 Description

Load/Pipeline Analysis and Planning:
• Evaluate the data warehouse architecture and requirements.
• Map data, transformations and processes to GCP services (Cloud Storage, BigQuery, Dataproc).
• Define the data migration strategy (full load, incremental, CDC).
• Develop a data architecture plan on GCP.

Data Design and Modeling on GCP:
• Design table schemas in BigQuery considering performance, cost and scalability.
• Define partitioning and clustering strategies for BigQuery.
• Model data zones in Cloud Storage (Bronze, Silver and Gold).

ELT/ETL Pipeline Development:
• Create data transformation routines using Dataproc (Spark) or Dataflow to load data into BigQuery.
• Translate business logic and existing transformations into GCP.
• Implement data validation and quality mechanisms.

Performance and Cost Optimization:
• Optimize BigQuery queries to reduce costs and improve performance.
• Tune and optimize Spark jobs on Dataproc.
• Monitor and optimize GCP resource usage to control costs.

Data Security and Governance:
• Implement and ensure data security in transit and at rest.
• Define and enforce IAM policies to control access to data and resources.
• Ensure compliance with data governance policies.

Monitoring and Support:
• Troubleshoot performance and functionality issues of data pipelines and GCP resources.

Documentation:
• Document the architecture, data pipelines, data models and operational procedures.

Communication:
• Communicate effectively with team members, stakeholders and other company areas.
• Ensure clear communication between architecture definitions and software components, and oversee the evolution and quality of the team's development.

Jira / Agile Methodologies:
• Be familiar with agile methodologies and their rituals, and be proficient with Jira.

🎯 Requirements

DBT:
• At least 3 years of proven experience with DBT.
• Proficiency with:
  • models (staging, intermediate, marts)
  • ref() and source()
  • macros (Jinja)
  • seeds and snapshots
  • tests (not null, unique, custom)
• Layered organization: Staging → Transform → Mart (Data Warehouse)

Google Cloud Platform (GCP):
• BigQuery: Deep knowledge of data modeling, query optimization, partitioning, clustering, data ingestion (streaming and batch), security and data governance.
• Cloud Storage: Experience managing buckets, storage classes, lifecycle policies, access control (IAM) and data security.
• Dataproc: Skills in provisioning, configuring and managing Spark/Hadoop clusters, optimizing jobs, and integrating with other GCP services.
• Dataflow/Composer/DBT: Knowledge of orchestration and data processing tools for ELT/ETL pipelines.
• Cloud IAM (Identity and Access Management): Implementing security policies and granular access control.
• VPC, Networking and Security: Understanding of networks, subnets, firewall rules and cloud security best practices.

Programming Languages:
• Python and PySpark: Essential for automation scripts, data pipeline development and integration with GCP APIs.
• SQL (advanced): For BigQuery, DBT and data transformations.
• Shell Scripting: For task automation.

Version Control:
• Git/GitHub/Bitbucket.

🏖️ Benefits

• 🏥 Health plan (Porto Seguro)
• 🦷 Dental plan (Porto Seguro)
• 💰 Profit Sharing (PLR)
• 👶 Childcare allowance
• 🍽️ Meal and food vouchers (Alelo)
• 💻 Home office allowance
• 📚 Partnerships with educational institutions
• 🚀 Incentives for certifications, including Cloud certifications
• 🎁 Livelo points
• 🏋️‍♂️ TotalPass
• 🧘‍♂️ Mindself
