AI QA Trainer – LLM Evaluation

🕒 March 20

🌏 Anywhere in the World

💵 $6 - $65 / hour

⏳ Contract/Temporary

🟡 Mid-level

🟠 Senior

🔧 QA Engineer (Quality Assurance)

Invisible Technologies

201 - 500 employees

Founded 2015

🤖 Artificial Intelligence

☁️ SaaS

🏢 Enterprise

🔥 Funding within the last year

💰 $100M raised in 2025-10 (series unknown)

Invisible Technologies is an enterprise AI platform and services company that builds and deploys production-grade AI systems for large organizations. It combines a modular SaaS platform (data platform, process builder, agents, evaluations) with an expert human marketplace to train models, automate complex back-office workflows, power contact centers, provide computer vision and demand-forecasting solutions, and ensure ongoing evaluation and governance. Invisible works across sectors (finance, healthcare, public sector, sports, retail) to integrate AI into real operational systems and scale outcomes.

📋 Description

• Converse with the model on real-world scenarios and evaluation prompts
• Verify factual accuracy and logical soundness
• Design and run test plans and regression suites
• Build clear rubrics and pass/fail criteria
• Capture reproducible error traces with root-cause hypotheses
• Suggest improvements to prompt engineering, guardrails, and evaluation metrics (e.g., precision/recall, faithfulness, toxicity, and latency SLOs)
• Partner on adversarial red-teaming, automation (Python/SQL), and dashboarding to track quality deltas over time
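
As a rough illustration of what "pass/fail criteria" built on precision/recall might look like in practice, here is a minimal Python sketch. It is not Invisible's tooling or method: the `EvalCase` fields, the helper names, and the 0.8 thresholds are all hypothetical, chosen only to make the idea concrete.

```python
from dataclasses import dataclass


@dataclass
class EvalCase:
    """One reviewed model response (hypothetical schema, for illustration only)."""
    prompt: str
    expected_flag: bool   # ground truth: the response really contains an error
    predicted_flag: bool  # the reviewer or automated check flagged an error


def precision_recall(cases):
    """Return (precision, recall) of the flags against ground truth."""
    tp = sum(c.expected_flag and c.predicted_flag for c in cases)
    fp = sum((not c.expected_flag) and c.predicted_flag for c in cases)
    fn = sum(c.expected_flag and (not c.predicted_flag) for c in cases)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def passes(cases, min_precision=0.8, min_recall=0.8):
    """Pass/fail gate: both metrics must meet their (assumed) thresholds."""
    precision, recall = precision_recall(cases)
    return precision >= min_precision and recall >= min_recall


# Small made-up sample: two true positives, one miss, one false alarm.
sample = [
    EvalCase("q1", True, True),
    EvalCase("q2", True, True),
    EvalCase("q3", True, False),
    EvalCase("q4", False, True),
    EvalCase("q5", False, False),
]

precision, recall = precision_recall(sample)
print(f"precision={precision:.2f} recall={recall:.2f} pass={passes(sample)}")
```

A gate like this could be rerun on every model or prompt change as part of a regression suite, so that quality deltas show up as a changed pass/fail result.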

🎯 Requirements

• Bachelor’s, master’s, or PhD in computer science, data science, computational linguistics, statistics, or a related field is ideal
• Shipped QA for ML/AI systems
• Safety/red-team experience
• Test automation frameworks (e.g., PyTest)
• Hands-on work with LLM eval tooling (e.g., OpenAI Evals, RAG evaluators, W&B)
• Skills that stand out include: evaluation rubric design, adversarial testing/red-teaming, regression testing at scale, bias/fairness auditing, grounding verification, prompt and system-prompt engineering, test automation (Python/SQL), and high-signal bug reporting
• Clear, metacognitive communication (“showing your work”) is essential

🏖️ Benefits

• Company-sponsored benefits such as health insurance do not apply
• You’ll supply a secure computer and high-speed internet
