Web Scraping Engineer II

🕒 February 13

🇮🇳 India – Remote

⏰ Full Time

🟡 Mid-level

🟠 Senior

👷🏻‍♀️ Engineer

🗣️🇺🇸🇬🇧 English required

Puppeteer

Selenium

YipitData

201 - 500 employees

💸 Finance

🏢 Enterprise

Finance • Enterprise • Data Analysis

YipitData is a company that specializes in providing accurate, timely insights on more than 1,000 companies by analyzing billions of data points every day. It offers detailed research to help investors make smarter decisions and helps companies grow their market share, sales, and customer base. YipitData delivers alternative data through a variety of datasets, including receipt data, card data, web data, and publicly reported financial results. This data is used to track market trends, consumer behavior, and product categories, giving companies greater transparency into performance metrics. Its services cover investors, companies, and data partners, with a strong focus on accuracy and near-real-time data delivery.

Description

• Refactor and Maintain Web Scrapers
  • Overhaul existing scraping scripts to improve reliability, maintainability, and efficiency.
  • Implement best coding practices (clean code, modular architecture, code reviews, etc.) to ensure quality and sustainability.
• Implement Advanced Scraping Techniques
  • Utilize sophisticated fingerprinting methods (cookies, headers, user-agent rotation, proxies) to avoid detection and blocking.
  • Handle dynamic content, navigate complex DOM structures, and manage session/cookie lifecycles effectively.
• Collaborate with Cross-Functional Teams
  • Work closely with analysts and other stakeholders to gather requirements, align on targets, and ensure data quality.
  • Provide support, documentation, and best practices to internal stakeholders to ensure effective use of our web-scraped data in critical reporting workflows.
• Monitor and Troubleshoot
  • Develop robust monitoring solutions and alerting frameworks to quickly identify and address failures.
  • Continuously evaluate scraper performance, proactively diagnosing bottlenecks and scaling issues.
• Drive Continuous Improvement
  • Propose new tooling, methodologies, and technologies to enhance our scraping capabilities and processes.
  • Stay up to date with industry trends, evolving bot-detection tactics, and novel approaches to web data extraction.

🎯 Requirements

• 3+ years of experience with web scraping frameworks (e.g., Selenium, Playwright, or Puppeteer).
• Strong understanding of HTTP, RESTful APIs, HTML parsing, browser rendering, and TLS/SSL mechanics.
• Expertise in advanced fingerprinting and evasion strategies (e.g., browser fingerprint spoofing, request signature manipulation).
• Deep experience managing cookies, headers, session states, and proxy rotations, including the deployment of both residential and data center proxies.
• Experience with logging, metrics, and alerting to ensure high availability.
• Troubleshooting skills to optimize scraper performance for efficiency, reliability, and scalability.

🏖️ Benefits

• We care about your personal life and we mean it. We offer vacation time, parental leave, team events, learning reimbursement, and more!
• Your growth at YipitData is determined by the impact that you are making, not by tenure, unnecessary facetime, or office politics. Everyone at YipitData is empowered to learn, self-improve, and master their skills in an environment focused on ownership, respect, and trust.
