Machine Learning Engineer – Document Intelligence, Applied GenAI

2 days ago

PandaDoc

SaaS • B2B • Productivity

PandaDoc is a comprehensive document management solution that helps businesses streamline their document workflows. It offers a range of features including custom agreement generation, eSignatures, CPQ (configure, price, quote) capabilities, and real-time collaboration tools. PandaDoc is designed for ease of use, enabling teams to automate document creation and management processes, thus improving efficiency and reducing errors. The platform integrates with popular CRM systems, payment gateways, and other tools to facilitate seamless business operations. Focused on security and compliance, PandaDoc supports legal and secure electronic transactions, making it ideal for businesses looking to optimize their agreement management processes.

501 - 1000 employees

Founded 2011

💰 Series C on 2021-09

📋 Description

• Build and maintain evaluation frameworks for document models, LLMs, OCR, and structured extraction.
• Define metrics, benchmarks, and validation strategies for real-world document workloads.
• Design and curate high-quality datasets for supervised training, fine-tuning, and validation.
• Create scalable preprocessing pipelines for PDFs, scans, images, forms, and semi-structured documents.
• Train and fine-tune transformer-based OCR, VLMs, layout models, and open-source LLMs for document understanding tasks.
• Optimize models for reliability, accuracy, and cost efficiency in production environments.
• Deploy ML models with modern inference runtimes (vLLM, TGI, TensorRT, ONNX Runtime).
• Build guardrails, monitoring, and fallback mechanisms to ensure safe and predictable model behavior.
• Develop retrieval and chunking strategies tailored to document structures (tables, forms, multi-page PDFs).
• Optimize end-to-end RAG pipelines for semantic search, Q&A, and workflow automation.
• Partner with PMs, backend engineers, and product designers to define AI opportunities and translate requirements into technical solutions.

🎯 Requirements

• 5+ years of Python experience
• Experience training, fine-tuning, and deploying traditional computer vision models for document intelligence tasks (layout detection, table extraction, OCR, information extraction)
• Hands-on experience with document understanding frameworks and models:
  • Traditional document AI models (LayoutLM, Donut, DocFormer)
  • Modern vision-language models with OCR capabilities (DeepSeek-OCR, LightOnOCR-1B, etc.)
• Experience deploying and optimizing models using inference frameworks such as vLLM (preferred), TGI, TensorRT, or ONNX Runtime
• Experience applying LLMs to document intelligence workflows, including both frontier models and open-source alternatives
• Strong understanding of coordinate systems and spatial reasoning for absolute positioning and field detection in forms/documents

🏖️ Benefits

• An honest, open culture that emphasizes feedback and promotes professional and personal development
• An opportunity to work from anywhere: our team is distributed worldwide, from Lisbon to Manila, from Florida to California
• 6 self-care days
• A competitive salary
• And much more!
