Senior Software Engineer, Data Processing

🔥 0 minutes ago

Grupo Protege

10,000+ employees

Founded 1971

🤖 Artificial Intelligence

🤝 B2B

☁ SaaS

Grupo Protege is an AI training data platform that connects AI developers with high-quality, ethically sourced training data. It serves AI developers by providing a large, rich collection of data for model training, and it serves data holders by letting them monetize their data while retaining governance and control. The platform aims to streamline data procurement so that developers can access the data they need efficiently.

📋 Description

• Design, build, and operate the ingestion systems that process large volumes of multimodal data into usable, well-structured datasets
• Own the ingestion path end to end, from how data lands to how it is validated, processed, tracked, and made available downstream
• Build modality-specific processing steps for real-world source data, such as medical imaging processing, audio and video metadata extraction, quality validation, and notes processing
• Build parsers, validators, and normalization logic that can systematically handle messy, non-standard, and high-variance source formats
• Turn repeated one-off data handling work into reusable processing patterns, internal tooling, and platform capabilities
• Build for high volume and high throughput, optimizing systems for reliability, cost, and speed
• Work across distributed and parallel compute systems to process workloads that do not fit well on a single machine
• Choose the right execution model for the workload, including batch processing, distributed execution, and modern compute patterns for unstructured data and inference-heavy processing
• Diagnose and resolve bottlenecks across ingestion and processing systems, and keep performance from degrading as volume and modality complexity grow
• Build validation and quality checks that catch bad, incomplete, or malformed data before it propagates downstream
• Handle sensitive and regulated data, including PHI, with the security and care the domain demands, including de-identification where required
• Track provenance, metadata, and usage constraints through the ingestion path so downstream use remains compliant and auditable
• Raise the quality bar for observability, debuggability, and operational reliability across the ingestion layer
• Partner with product and Data Lab to support new modalities, new partner requirements, and non-standard source data
• Work directly with partner engineering teams when needed to translate source-system realities into robust ingestion and processing design
• Surface recurring patterns that are worth standardizing into reusable transforms, validators, and internal tooling
• Help shape how Protege handles new data types as the platform expands into more complex data environments

🎯 Requirements

• 5+ years building and operating production backend or data systems, with real experience in data processing at scale
• Hands-on experience designing and running large-scale data pipelines
• Strong programming skills in Python
• Experience with distributed data processing
• Strong proficiency with AWS
• Comfort with messy, varied, high-volume data and high ambiguity, with a knack for finding patterns in complex environments
• Attention to detail without losing speed, and a bias to action
• Excited to work on a product built around moving and processing large volumes of data
• Curious, tenacious, and proactive

🏖️ Benefits

• Health insurance
• Professional development opportunities
• Flexible working hours

Similar Jobs

🔥 15 minutes ago

Verity Group

51 - 200

🤖 Artificial Intelligence

🔒 Cybersecurity

Fullstack Tech Lead at Verity focused on implementing agile practices in software development and team leadership. Responsible for the quality of code and direction of technical product decisions.

🗣️🇧🇷🇵🇹 Portuguese Required

Angular

AWS

Azure

Cloud

Docker

Google Cloud Platform

Java

JavaScript

jQuery

Kubernetes

Python

SQL

🔥 15 minutes ago

Verity Group

51 - 200

🤖 Artificial Intelligence

🔒 Cybersecurity

Senior Fullstack Developer at Verity, focusing on Angular and Java. Engages in digital solution acceleration and transformation projects.

🗣️🇧🇷🇵🇹 Portuguese Required

Angular

AWS

Azure

Cloud

Java

SOAP

Spring

Spring Boot

🔥 52 minutes ago

Sicredi

10,000+ employees

🏩 Banking

💾 Finance

Technical leader in software engineering for Sicredi, guiding the development of robust solutions. Collaborate across teams to support engineering culture and deliver value to members.

🗣️🇧🇷🇵🇹 Portuguese Required

Apache

Java

Kafka

Kotlin

NoSQL

Scala

SQL

🔥 1 hour ago

Efí Bank

201 - 500

🏩 Banking

💾 Finance

💳 Fintech

Tech Lead guiding development and architecture for critical credit card solutions. Leading innovation and engineering best practices at Efí Bank in Brazil.

🗣️🇧🇷🇵🇹 Portuguese Required

AWS

Cloud

EC2

JavaScript

Node.js

Postgres

TypeScript

🔥 3 hours ago

Stefanini Brasil

10,000+ employees

🤖 Artificial Intelligence

🔒 Cybersecurity

Technical Leader overseeing ServiceNow development projects at Stefanini. Leading teams and driving technical solutions for business transformation.

🗣️🇧🇷🇵🇹 Portuguese Required

ITSM

ServiceNow

SOAP