Web Scraping Engineer II

🕒 3 months ago

🇮🇳 India – Remote

⏰ Full Time

🟡 Mid-level

🟠 Senior

👷🏻‍♀️ Engineer

🗣️🇺🇸🇬🇧 English required

Puppeteer

Selenium

YipitData

201 - 500 employees

💸 Finance

🏢 Enterprise

Finance • Enterprise • Data Analysis

YipitData specializes in providing accurate and timely insights on more than 1,000 companies by analyzing billions of data points every day. It offers detailed research to help investors make smarter decisions and helps businesses grow their market share, sales, and customer base. YipitData delivers alternative data through a range of datasets, including receipt data, card data, web data, and publicly reported earnings. This data is used to track market trends, consumer behavior, and product categories, giving companies transparency into performance metrics. Its services are aimed at investors, corporations, and data partners, with a strong focus on accuracy and near-real-time data delivery.

Description

• Refactor and Maintain Web Scrapers
  • Overhaul existing scraping scripts to improve reliability, maintainability, and efficiency.
  • Implement best coding practices (clean code, modular architecture, code reviews, etc.) to ensure quality and sustainability.
• Implement Advanced Scraping Techniques
  • Utilize sophisticated fingerprinting methods (cookies, headers, user-agent rotation, proxies) to avoid detection and blocking.
  • Handle dynamic content, navigate complex DOM structures, and manage session/cookie lifecycles effectively.
• Collaborate with Cross-Functional Teams
  • Work closely with analysts and other stakeholders to gather requirements, align on targets, and ensure data quality.
  • Provide support, documentation, and best practices to internal stakeholders to ensure effective use of our web-scraped data in critical reporting workflows.
• Monitor and Troubleshoot
  • Develop robust monitoring solutions and alerting frameworks to quickly identify and address failures.
  • Continuously evaluate scraper performance, proactively diagnosing bottlenecks and scaling issues.
• Drive Continuous Improvement
  • Propose new tooling, methodologies, and technologies to enhance our scraping capabilities and processes.
  • Stay up to date with industry trends, evolving bot-detection tactics, and novel approaches to web data extraction.

🎯 Requirements

• 3+ years of experience with web scraping frameworks (e.g., Selenium, Playwright, or Puppeteer).
• Strong understanding of HTTP, RESTful APIs, HTML parsing, browser rendering, and TLS/SSL mechanics.
• Expertise in advanced fingerprinting and evasion strategies (e.g., browser fingerprint spoofing, request signature manipulation).
• Deep experience managing cookies, headers, session states, and proxy rotations, including the deployment of both residential and data center proxies.
• Experience with logging, metrics, and alerting to ensure high availability.
• Troubleshooting skills to optimize scraper performance for efficiency, reliability, and scalability.

🏖️ Benefits

• We care about your personal life and we mean it. We offer vacation time, parental leave, team events, learning reimbursement, and more!
• Your growth at YipitData is determined by the impact that you are making, not by tenure, unnecessary facetime, or office politics. Everyone at YipitData is empowered to learn, self-improve, and master their skills in an environment focused on ownership, respect, and trust.

Similar Jobs

🕒 3 months ago

Miratech

501 - 1000

Amazon Connect Engineer responsible for designing, building, and optimizing contact center solutions on AWS. Work involves coordination, technical support, and collaboration across teams.

🇮🇳 India – Remote

💰 Private Equity Round in 2022-04

⏰ Full Time

🟡 Mid-level

🟠 Senior

👷🏻‍♀️ Engineer

🗣️🇺🇸🇬🇧 English required

🕒 3 months ago

Miratech

501 - 1000

Amazon Connect Engineer designing and optimizing contact center solutions on AWS. Collaborate with teams for building and supporting advanced customer engagement.

🇮🇳 India – Remote

💰 Private Equity Round in 2022-04

⏰ Full Time

🟡 Mid-level

🟠 Senior

👷🏻‍♀️ Engineer

🗣️🇺🇸🇬🇧 English required

🕒 3 months ago

Miratech

501 - 1000

Amazon Connect Engineer designing and optimizing contact center solutions on AWS for a global IT services company. Responsibilities include building call flow designs and providing technical support for contact center technology.

🇮🇳 India – Remote

💰 Private Equity Round in 2022-04

⏰ Full Time

🟡 Mid-level

🟠 Senior

👷🏻‍♀️ Engineer

🗣️🇺🇸🇬🇧 English required

🕒 4 months ago

Netomi

51 - 200

🤖 Artificial Intelligence

🏢 Enterprise

☁️ SaaS

Documentation Engineer designing and improving documentation infrastructure for Netomi's AI platform. Collaborating with writers and engineers to modernize documentation tools and workflows.

🇮🇳 India – Remote

💰 €30,000,000 Series B in 2021-11

⏰ Full Time

🟠 Senior

👷🏻‍♀️ Engineer

🗣️🇺🇸🇬🇧 English required

🕒 4 months ago

Toku

51 - 200

📡 Telecommunications

☁️ SaaS

Services Engineer implementing and deploying cloud communication and customer engagement solutions at Toku. Focused on optimizing voice and contact center platforms with a strong troubleshooting capability.

🇮🇳 India – Remote

💰 €5,000,000 Series A in 2022-10

⏰ Full Time

🟡 Mid-level

🟠 Senior

👷🏻‍♀️ Engineer

🗣️🇺🇸🇬🇧 English required