Senior Software Engineer, AI Eval

🕒 January 10

🏢🏡 San Francisco – Hybrid

💵 $240k - $280k / year

⏰ Full Time

🟠 Senior

🧑‍💻 Full-stack Engineer

🦅 H1B Visa Sponsor

Sentry

201 - 500 employees

Founded 2011

☁️ SaaS

🏢 Enterprise

SaaS • Developer Tools • Enterprise

Sentry is an application monitoring and error-tracking platform that helps developers identify, debug, and resolve software errors and performance issues. The company provides tools such as error monitoring, tracing, session replay, profiling, uptime monitoring, logs, and SDKs across many languages and frameworks. Sentry integrates with developer workflows and platforms (GitHub, Slack, Jira, Vercel, Netlify) and emphasizes developer productivity, privacy controls, and enterprise security and compliance.

📋 Description

• Design and build robust evaluation frameworks to measure accuracy, reliability, regressions, and edge cases in AI systems
• Create and curate high-quality datasets, golden test cases, and benchmarks grounded in real production data
• Build automated test harnesses and metrics pipelines to continuously evaluate models, prompts, and agentic workflows
• Partner closely with applied AI engineers and product leaders to define what “good” looks like and translate it into measurable criteria
• Own the evaluation lifecycle for major AI initiatives, from early experimentation through production monitoring

🎯 Requirements

• 5+ years of professional experience with a Bachelor’s degree in computer science, machine learning, or a related field
• Experience building testing, evaluation, or data infrastructure for complex systems (AI/ML experience strongly preferred)
• Comfort writing production-quality code (we use Python and TypeScript)
• Experience working with structured and unstructured datasets, labeling workflows, or data quality pipelines
• Familiarity with modern ML systems and evaluation techniques (e.g., offline metrics, online evaluation, regression testing for models or prompts)
• Bonus: experience evaluating LLMs, agentic systems, or AI-assisted developer tools

🏖️ Benefits

• A successful candidate will be eligible to participate in Sentry’s employee benefit plans/programs applicable to the candidate’s position (including incentive compensation, equity grants, paid time off, and group health insurance coverage).
