Data Engineer – Web Scraping, LLM Pipelines, Scalable Data Infrastructure

NIR-YU

201 - 500 employees

🎯 Recruiter

👥 HR Tech

🏢 Enterprise

NIR-YU is a company dedicated to empowering small and medium enterprises (SMEs) by providing tailored nearshore staffing solutions. They focus on strategic recruitment of skilled nearshore professionals, offering cost-plus pricing for transparency and affordability. Their services include staff augmentation, staff leasing, talent acquisition, and employer of record (EoR), aimed at helping SMEs gain access to a skilled workforce from Latin America. By facilitating the hiring of remote talent and ensuring compliance with local regulations, NIR-YU allows businesses to enhance their teams with English-speaking professionals, optimize costs, and focus on growth strategies without the burden of setting up foreign entities.

📋 Description

• Build new structured datasets, including scraping accelerators, Form D filings and dynamic web sources.
• Develop automated ETL pipelines that parse, clean and transform content using LLMs.
• Define and maintain database schemas in Supabase or PostgreSQL.
• Create evaluation frameworks to measure and compare LLM performance across pipeline components.
• Contribute to the design of scalable data architectures using GCP services.
• Improve reliability, observability and deployment workflows for scraping and data processing systems.

🎯 Requirements

• 4+ years of experience building data pipelines, backend services and automated data processing systems.
• Strong background in web scraping with tools like Scrapy, Playwright or similar.
• Experience deploying pipelines on cloud platforms such as GCP or AWS.
• Solid knowledge of ETL frameworks, workflow orchestration (Airflow) and modern data stores (BigQuery, PostgreSQL).
• Comfortable working with Docker and API frameworks like FastAPI.
• Clear, fluent communication in English.
