
Artificial Intelligence • B2B • SaaS
Grupo Protege is an AI training data platform that connects AI developers with high-quality, ethically sourced training data. It gives AI developers access to a large collection of data for model training, and it lets data holders monetize their data while retaining governance and control. The platform aims to streamline data procurement so developers can get the data they need more efficiently.
10,000+ employees
Founded 1971
September 11

• Manage end-to-end data preparation through QA and delivery, ensuring cross-functional coordination and on-time execution
• Perform QA, packaging, and delivery of complex datasets in collaboration with media producers and operational partners
• Curate high-quality content samples and datasets for customers across industries and use cases
• Search, filter, and tag audiovisual and media content across the catalog using tools and technologies
• Translate vague customer requests into specific, well-documented data outputs
• Write lightweight SQL queries to explore metadata and source specific content
• Support dashboard creation and reporting on content coverage, usage trends, and data health
• Identify gaps or underrepresented areas in the content catalog and flag them to the team
• Perform quality assurance on data deliveries and help build automated, scalable QA processes
• Collaborate with product, partnerships, and licensing teams to meet evolving data needs
• Early-career professional (3-5 years of experience)
• Background in media, data, information science, or a related field
• Strong organizational and research skills
• Comfortable navigating large, unstructured datasets
• Strong communicator, able to present ideas internally and maintain external-facing communications
• Curious and detail-oriented, able to identify missing content and propose product-oriented solutions
• Basic SQL proficiency (SELECT statements, joins, filters)
• Interest in media, entertainment, AI, or content strategy
• Self-starter who takes ownership and enjoys wearing multiple hats