Senior Web Scraping Engineer

3 days ago

Infinity Constellation

AI • B2B • Fintech

Infinity Constellation is the first AI Holding Company dedicated to helping elite founders build AI companies that can operate at unprecedented speeds. By leveraging advanced AI platforms, Infinity Constellation aims to reinvent traditional venture models, focusing on efficient structures that allow businesses to generate profits rapidly with minimal resources. The company supports its portfolio through various operational services, enabling founders to scale and thrive in the rapidly evolving AI landscape.

1 - 10 employees

Founded 2023

🤝 B2B

💳 Fintech

📋 Description

• Design, implement, and maintain web scraping pipelines for a wide variety of websites and data sources.
• Build scrapers using tools and frameworks such as Selenium, Playwright, BeautifulSoup, Scrapy (and similar libraries) with a focus on reliability, performance, and maintainability.
• Create automated workflows for scraping and data processing:
  • Containerize scraping jobs (e.g., using Docker).
  • Deploy and orchestrate them in the cloud (e.g., AWS, GCP, Azure).
  • Configure scheduling (e.g., run daily/weekly/hourly) and dependency management.
• Implement monitoring, alerting, and logging:
  • Capture detailed logs for each job run.
  • Track job statuses and failures.
  • Implement notifications/alerts when a scraper breaks or a website changes.
• Handle anti-bot measures (proxies, captchas, rate limits) and design scrapers that are resilient to layout and structure changes.
• Work closely with data engineering / product / ML teams to understand data requirements and ensure data quality.
• Utilize LLMs (Large Language Models) to:
  • Parse and extract structured information from messy HTML or semi-structured content.
  • Increase robustness of scrapers to frequent UI/DOM changes.
  • Prototype new scraping / extraction strategies using LLM APIs.
• Write clean, well-tested, and well-documented code, and contribute to best practices, code reviews, and tooling for the team.
• Continuously improve the scraping platform, including performance optimizations, standardization, and reusability of components.

🎯 Requirements

• 3+ years of professional experience working with web scraping or data collection at scale.
• Strong proficiency in Python and common scraping libraries/frameworks such as Selenium, Playwright, BeautifulSoup, Scrapy (or similar).
• Solid understanding of HTML, CSS, JavaScript, HTTP, and browser behavior.
• Experience building automated, production-grade workflows:
  • Orchestrators / schedulers (e.g., Airflow, Prefect, Dagster, or similar).
  • Building ETL/ELT pipelines and integrating with databases, data warehouses, or storage (e.g., PostgreSQL, BigQuery, S3, GCS).
• Hands-on experience with cloud platforms (AWS, GCP, or Azure), including:
  • Deploying and running scheduled jobs.
  • Managing infrastructure-as-code or similar deployment processes.
• Strong experience with logging, monitoring, and alerting:
  • Ability to design logging for scraping jobs and to debug failures from logs.
  • Familiarity with tools like CloudWatch, Stackdriver, ELK, Prometheus, Grafana, or similar.
• Experience with containers (Docker) and familiarity with CI/CD workflows.
• Exposure to LLMs (e.g., OpenAI, Anthropic, etc.) for tasks like parsing, information extraction, or automation.
• Strong problem-solving skills and the ability to debug complex, dynamic websites.
• Comfortable working in a fast-paced environment, with good communication skills in English.

🏖️ Benefits

• Fully remote, flexible hours
• Work on a global team, with real-world challenges
• Payment in USD (contractor/freelance basis)
