QA Engineer, AI Products

🕒 May 19

🇺🇸 United States – Remote (USA)

⏰ Full Time

🟡 Mid-level

🟠 Senior

🔧 QA Engineer (Software Quality)

🗣️🇺🇸🇬🇧 English required

MDCalc

11 - 50 employees

Founded in 2011

⚕️ Healthcare Insurance

☁️ SaaS

📚 Education

MDCalc is a widely used tool that offers a comprehensive set of medical calculators for healthcare professionals. It helps these professionals make informed decisions by providing calculations for a variety of health assessments and treatment strategies, covering areas such as cardiac risk, pulmonary embolism, liver fibrosis, and more. MDCalc is used by millions of healthcare professionals worldwide to support the care of hundreds of millions of patients, ensuring that calculations are reviewed and not used as the sole guide for patient care. The platform also offers educational resources and integrates with electronic health records (EHR).

Description

• Design and execute test strategies for LLM-powered features, including prompt regression testing, output evaluation, and hallucination detection
• Build and maintain automated evaluation pipelines (eval sets, golden datasets, LLM-as-judge frameworks) to catch quality regressions in non-deterministic outputs
• Perform black-box and exploratory testing of MDCalc's AI features across web and mobile, with particular attention to clinical accuracy, safety, and edge cases
• Define quality metrics for AI outputs (accuracy, faithfulness, relevance, safety, latency, cost) and establish thresholds for release readiness
• Collaborate cross-functionally with engineers, product managers, ML/AI engineers, and clinical reviewers to define what "good" looks like for AI responses
• Investigate and triage AI failure modes, distinguishing model issues, prompt issues, retrieval issues, and integration bugs
• Participate in team discussions, offering feedback on testability, risks, prompt design, and guardrails
• Help develop QA strategies to expand future testing capacity, automation, and evaluation coverage as the AI product surface grows

🎯 Requirements

• 5+ years of experience in software QA, with at least 1 year of hands-on testing of LLM-based or AI/ML-powered features
• Strong understanding of QA principles, test case creation/documentation, and best practices for both deterministic and non-deterministic systems
• Hands-on experience with LLM tooling and concepts: prompt engineering, RAG systems, evaluation frameworks (e.g., Promptfoo, Braintrust, LangSmith, DeepEval, Ragas, OpenAI Evals), and LLM APIs (OpenAI, Anthropic, etc.)
• Experience designing automated qualitative evaluation approaches, including LLM-as-judge, rubric-based scoring, semantic similarity checks, and golden dataset regression testing
• Proficiency with test automation tools, with a focus on Playwright
• Strong SQL skills for data validation, test data creation, and verifying data integrity across systems
• Familiarity with token usage, latency profiling, and cost monitoring as quality signals
• Eagerness to learn quickly and a positive, solutions-oriented attitude
• Clear and concise communicator, able to surface issues, blockers, and risks effectively when communicating ambiguous or probabilistic failures
• Self-motivated, proactive, and able to manage time and priorities independently

🏖️ Benefits

• Ability to make a true difference in medicine: MDCalc is the most broadly used medical reference by physicians, used by over 65% of US attending doctors weekly
• Medical, Dental, & Vision Coverage, with the option to extend to your dependents
• Company-sponsored short-term insurance
• Fully paid 8-week parental leave, after 6 months of employment
• Company-sponsored 401k, after 3 months of employment
• Unlimited vacation for salaried roles: we trust you to take the time you need
• Bi-annual company offsites to connect, reflect, and plan together
• Monthly work-from-home stipend
• A culture of fun and motivated team members who believe in a greater mission here at MDCalc

