Senior Data Scientist, NLP

November 21

Clarivate

Education • Healthcare Insurance • Intellectual Property

Clarivate is a leading global provider of transformative intelligence and comprehensive solutions in the areas of Academia & Government, Intellectual Property, and Life Sciences & Healthcare. The company offers a range of enriched data, insights, analytics, workflow software, and expert services to support research institutions, library services, and intellectual property management. Clarivate's well-known products and services include Web of Science, ProQuest, Ex Libris, and Cortellis, which help drive research excellence, support IP management, and advance innovation in healthcare. Clarivate aims to empower academic institutions, government organizations, corporations, and law firms with the data and insights needed to make informed decisions and foster innovation.

10,000+ employees

Founded 2016

📋 Description

• Design NLP Workflows: Develop scalable pipelines for text ingestion, cleaning, normalization, and tokenization to support downstream applications.
• Implement Indexing and Vectorization Strategies: Architect and maintain robust indexing systems and vector databases for semantic search and retrieval.
• Develop Prompting and Finetuning Frameworks: Create reusable prompting strategies and lead fine-tuning initiatives for LLMs tailored to business-specific tasks.
• Build LangChain/LangGraph Applications: Construct dynamic knowledge systems and agentic workflows using LangChain and LangGraph.
• Integrate Advanced RAG Architectures: Apply VRAG and GraphRAG design patterns to enrich information retrieval and contextual understanding.
• Conduct Performance Optimization: Perform benchmark testing and model evaluations to improve accuracy, efficiency, and scalability of NLP systems.
• Collaborate Across Teams: Work closely with engineering, product, and research stakeholders to deliver integrated AI-driven features.
• Provide Technical Leadership: Mentor junior data scientists, guide best practices, and drive innovation across AI projects.

🎯 Requirements

• Bachelor’s degree in Computer Science, Data Science, Computational Linguistics, or a related field
• At least 5 years of hands-on experience in data science, focused on natural language processing (NLP)
• At least 5 years of experience using Python, with expertise in NLP libraries such as LangChain, LangGraph, or other “Lang”-based toolkits
• Proven experience in model development and applying machine learning techniques to real-world problems
• Expertise in retrieval-based LLM workflows (RAG, VRAG, GraphRAG) is a plus
• Deep understanding of embedding models, semantic search, and vector stores (e.g., FAISS, Pinecone)
• Experience with document loaders and text splitters/document splitting strategies
• Familiarity with MLOps practices and production-level deployment of AI pipelines
• Experience with cloud platforms (e.g., AWS, Azure, or GCP)
• Experience applying Graph Neural Networks (GNNs) to retrieval-enhanced generation
• Knowledge of LangSmith and vector orchestration platforms
• Familiarity with multilingual NLP and cross-lingual embeddings
• Exposure to real-time knowledge graphs and stream-based RAG systems
• A Master’s or PhD in a technical field (Computer Science, Data Science, etc.)

🏖️ Benefits

• Medical
• Dental
• Prescription drug
• Life insurance
• 401(k) with match
• Long-term disability coverage
• Vacation
• Sick time
• Volunteer time
• Discount programs
