AI Engineer – Agent Evaluation Platform

November 9

Hyperskill

Education • Artificial Intelligence • SaaS

Hyperskill is reimagining education for the AI era, building intelligent, AI-driven learning systems and productivity tools that adapt to individual learners. It offers project-based courses, career paths, bootcamps, and team training focused on programming (Python, Java, Kotlin, SQL, Go, C++), AI engineering, and product-minded development, plus AI-native tools like Enlighter, Rolloo, InMind Lab, and StoryFlow to help professionals and organizations upskill and deploy practical AI solutions.

11 - 50 employees

📋 Description

• Build the technical infrastructure for comprehensive agent evaluation
• Create systems that automatically test agent performance
• Build tools for managing evaluation datasets
• Implement both deterministic tests and non-deterministic evaluation
• Make evaluation systems that handle enterprise workloads and provide reliable insights into agent performance

🎯 Requirements

• Deep AI engineering experience: you've built AI systems, deployed them in production, and dealt with the challenge of measuring their real-world performance
• Understanding of evaluation platforms: you've worked with tools like Langfuse and know the current limitations of AI testing
• Experience building evaluation systems: you've created tools that measure AI system quality and can distinguish between technical functionality and user value
• Ability to thrive in uncertainty: you'll need to build a lot, figure things out as you go, experiment constantly, and juggle multiple tasks across different areas at once

🏖️ Benefits

• Contractor agreement with a US-registered legal entity
• 100% remote: work from anywhere in the world
• Competitive salary in USD plus options in the product you're working on: we focus on market rates and are ready to hear your expectations and prepare an offer matching your expertise
• Resources: budget for tools, learning, and whatever you need to succeed
• Fast-moving environment: we ship fast, learn fast, and iterate based on real customer feedback
