
10,000+ employees
Founded 1971
Artificial Intelligence • B2B • SaaS
Grupo Protege is an AI training data platform that connects AI developers with high-quality, ethically sourced training data. It serves AI developers by providing a large, rich collection of data for model training, and it serves data holders by enabling them to monetize their data while maintaining governance and control. The platform aims to streamline data procurement so that developers can access the data they need efficiently.
0 minutes ago
• Design, build, and operate the ingestion systems that process large volumes of multimodal data into usable, well-structured datasets
• Own the ingestion path end to end, from how data lands to how it is validated, processed, tracked, and made available downstream
• Build modality-specific processing steps for real-world source data, such as medical imaging processing, audio and video metadata extraction, quality validation, and notes processing
• Build parsers, validators, and normalization logic that can systematically handle messy, non-standard, and high-variance source formats
• Turn repeated one-off data handling work into reusable processing patterns, internal tooling, and platform capabilities
• Build for high volume and high throughput, optimizing systems for reliability, cost, and speed
• Work across distributed and parallel compute systems to process workloads that do not fit well on a single machine
• Choose the right execution model for the workload, including batch processing, distributed execution, and modern compute patterns for unstructured data and inference-heavy processing
• Diagnose and resolve bottlenecks across ingestion and processing systems, and keep performance from degrading as volume and modality complexity grow
• Build validation and quality checks that catch bad, incomplete, or malformed data before it propagates downstream
• Handle sensitive and regulated data, including PHI, with the security and care the domain demands, including de-identification where required
• Track provenance, metadata, and usage constraints through the ingestion path so downstream use remains compliant and auditable
• Raise the quality bar for observability, debuggability, and operational reliability across the ingestion layer
• Partner with product and Data Lab to support new modalities, new partner requirements, and non-standard source data
• Work directly with partner engineering teams when needed to translate source-system realities into robust ingestion and processing design
• Surface recurring patterns that are worth standardizing into reusable transforms, validators, and internal tooling
• Help shape how Protege handles new data types as the platform expands into more complex data environments

• 5+ years building and operating production backend or data systems, with real experience in data processing at scale
• Hands-on experience designing and running large-scale data pipelines
• Strong programming skills in Python
• Experience with distributed data processing
• Strong proficiency with AWS
• Comfort with messy, varied, high-volume data and high ambiguity, with a knack for finding patterns in complex environments
• Attention to detail without losing speed, and a bias to action
• Excited to work on a product built around moving and processing large volumes of data
• Curious, tenacious, and proactive

• Health insurance
• Professional development opportunities
• Flexible working hours

15 minutes ago
Fullstack Tech Lead at Verity focused on implementing agile practices in software development and team leadership. Responsible for the quality of code and direction of technical product decisions.
🗣️🇧🇷🇵🇹 Portuguese Required
Angular
AWS
Azure
Cloud
Docker
Google Cloud Platform
Java
JavaScript
jQuery
Kubernetes
Python
SQL
15 minutes ago
Fullstack Developer Sênior at Verity focusing on Angular and Java. Engage in digital solution acceleration and transformation projects.
🗣️🇧🇷🇵🇹 Portuguese Required
Angular
AWS
Azure
Cloud
Java
SOAP
Spring
Spring Boot
52 minutes ago
Technical leader in software engineering for Sicredi, guiding the development of robust solutions. Collaborate across teams to support engineering culture and deliver value to members.
🗣️🇧🇷🇵🇹 Portuguese Required
Apache
Java
Kafka
Kotlin
NoSQL
Scala
SQL
1 hour ago
Tech Lead guiding development and architecture for critical credit card solutions. Leading innovation and engineering best practices at Efí Bank in Brazil.
🗣️🇧🇷🇵🇹 Portuguese Required
AWS
Cloud
EC2
JavaScript
Node.js
Postgres
TypeScript
3 hours ago
Technical Leader overseeing ServiceNow development projects at Stefanini. Leading teams and driving technical solutions for business transformation.
🗣️🇧🇷🇵🇹 Portuguese Required
ITSM
ServiceNow
SOAP