Senior Data Engineer – Generative AI, Data and MDM

🔥 0 minutes ago

🗣️🇧🇷🇵🇹 Portuguese Required

CI&T

5001 - 10000 employees

Founded 1995

🤖 Artificial Intelligence

☁️ SaaS

💰 $5.5M Venture Round on 2014-04

Artificial Intelligence • Cloud Services • SaaS

CI&T is a global tech transformation specialist focusing on helping organizations navigate their technology journey. With services spanning from application modernization and cloud solutions to AI-driven data analytics and customer experience, CI&T empowers businesses to accelerate their growth and maximize operational efficiency. The company emphasizes digital product design, strategy consulting, and immersive experiences, ensuring a robust support system for enterprises in various industries.

📋 Description

• Develop, optimize and maintain scalable data pipelines using Databricks, Spark and PySpark.
• Work on integration, transformation, cleansing and provisioning of large-scale master/customer data.
• Build and maintain cloud-based Big Data products, ensuring scalability, performance, reliability and data quality.
• Work with the customer master data domain, supporting initiatives related to Golden Record, data quality, governance and MDM.
• Design and implement data solutions that support intelligent systems based on Generative AI, agents and multi-agent systems.
• Prepare, structure and provision data for consumption by Machine Learning models, LLMs, autonomous agents and intelligent workflows.
• Support the construction of agents and multi-agent systems capable of analyzing customer master data at scale.
• Help identify patterns, inconsistencies, gaps, duplicates, anomalies and opportunities for improvement in master/customer data.
• Develop data mechanisms for generating alerts, explainable recommendations and decision support.
• Apply Analytics and Machine Learning techniques for anomaly detection, classification, clustering and scoring of master/customer data.
• Support strategies for data qualification, enrichment, prioritization and governance.
• Use Generative AI to support the generation, evolution, validation and explanation of business rules.
• Explore and support the implementation of architectures such as RAG, autonomous agents, multi-agent systems and intelligent workflows.
• Work with near real-time data processing using Kafka.
• Support the productionization of Machine Learning models on Databricks, applying MLOps concepts.
• Create, maintain and evolve CI/CD pipelines using GitHub and GitHub Actions.
• Apply development best practices so that solutions meet standards of quality, efficiency, maintainability and governance.
• Optimize the use of available data to maximize its value for business and technology areas.
• Collaborate with MDM, Data, Technology and Business teams to ensure developed solutions are measurable, governable and applicable.
• Support technical and functional refinements, ensuring clarity, feasibility and alignment of stories with project needs.
• Help create clear, well-defined and technically feasible user stories.
• Participate in AS-IS and TO-BE design, documenting current and future processes and identifying technical debt, risks and opportunities for improvement.
• Develop refined and approved stories, ensuring quality, efficiency and adherence to technical standards.
• Identify technical debt and propose continuous improvements in architecture, processes, data and solutions.
• Work in partnership with data and business focal points to ensure alignment between technical solutions, best practices and strategic needs.

🎯 Requirements

• Strong experience as a Data Engineer, Senior Data Engineer or equivalent role.
• Strong proficiency in Python for data engineering, automation, data analysis and supporting AI solutions.
• Solid knowledge of SQL and experience exploring structured data.
• Experience with Databricks, PySpark and Spark.
• Experience building, optimizing and maintaining scalable data pipelines.
• Experience in cloud environments, preferably Azure Databricks and GCP.
• Knowledge of or experience with Gemini to support GenAI solutions.
• Knowledge of Generative AI, LLMs and data-driven intelligent applications.
• Knowledge of agent architectures, multi-agent systems and autonomous systems.
• Experience with RAG, intelligent workflows, data-driven recommendation and cognitive automation.
• Experience with applied Machine Learning techniques, including anomaly detection, classification, clustering and scoring.
• Experience with near real-time data processing, preferably with Kafka.
• Experience with productionizing Machine Learning models on Databricks and with MLOps practices.
• Experience with CI/CD pipelines, especially GitHub and GitHub Actions.
• Mastery of data engineering best practices, versioning, testing, code review and data quality.
• Knowledge of programming logic, application development and performance optimization.
• Ability to translate business problems into data, analytical and intelligent solutions.
• Experience working in agile squads and with agile methodologies.

🏖️ Benefits

• Health and dental insurance
• Meal and grocery allowances
• Childcare assistance
• Extended parental leave
• Partnership with gyms and health & wellness professionals via Wellhub (Gympass) and TotalPass
• Profit Sharing (PLR)
• Life insurance
• Continuous learning platform (CI&T University)
• Discount club
• Free online platform dedicated to physical and mental health and wellbeing
• Pregnancy and responsible parenting course
• Partnerships with online course platforms
• Language learning platform
• And many others
