Senior Software Engineer, AI Eval

🕒 January 10

Sentry

201 - 500 employees

Founded 2011

☁️ SaaS

🏢 Enterprise

SaaS • Developer Tools • Enterprise

Sentry is an application monitoring and error-tracking platform that helps developers identify, debug, and resolve software errors and performance issues. The company provides tools such as error monitoring, tracing, session replay, profiling, uptime monitoring, logs, and SDKs across many languages and frameworks. Sentry integrates with developer workflows and platforms (GitHub, Slack, Jira, Vercel, Netlify) and emphasizes developer productivity, privacy controls, and enterprise security and compliance.

📋 Description

• Design and build robust evaluation frameworks to measure accuracy, reliability, regressions, and edge cases in AI systems
• Create and curate high-quality datasets, golden test cases, and benchmarks grounded in real production data
• Build automated test harnesses and metrics pipelines to continuously evaluate models, prompts, and agentic workflows
• Partner closely with applied AI engineers and product leaders to define what “good” looks like and translate it into measurable criteria
• Own the evaluation lifecycle for major AI initiatives, from early experimentation through production monitoring

🎯 Requirements

• At least 5 years of professional experience and a Bachelor’s degree in computer science, machine learning, or a related field
• Experience building testing, evaluation, or data infrastructure for complex systems (AI/ML experience strongly preferred)
• Comfort writing production-quality code (we use Python and TypeScript)
• Experience working with structured and unstructured datasets, labeling workflows, or data quality pipelines
• Familiarity with modern ML systems and evaluation techniques (e.g., offline metrics, online evaluation, regression testing for models or prompts)
• Bonus: experience evaluating LLMs, agentic systems, or AI-assisted developer tools

🏖️ Benefits

• A successful candidate will be eligible to participate in Sentry’s employee benefit plans/programs applicable to the candidate’s position (including incentive compensation, equity grants, paid time off, and group health insurance coverage).

Similar Jobs

🕒 January 9

Rocket Money (formerly Truebill)

51 - 200

💸 Finance

💳 Fintech

👥 B2C

Senior Software Engineer at Rocket Money developing software solutions for homeownership. Collaborating with internal teams and building impactful technology for clients.

🏢🏡 San Francisco – Hybrid

💵 $150k - $185k / year

⏰ Full Time

🟠 Senior

🧑‍💻 Full-stack Engineer

Postgres

Python

React

🕒 January 8

Unify

11 - 50

🤝 B2B

🤖 Artificial Intelligence

☁️ SaaS

AI Engineer at Unify leveraging AI to build innovative workflows and next-generation GTM solutions. Collaborating with teams and creating novel AI applications.

🕒 January 5

Parafin

51 - 200

💳 Fintech

💸 Finance

🤝 B2B

Senior Software Engineer for Parafin’s Infrastructure team, leading the evolution of the ML Platform and ensuring scalable, reliable systems for small business funding.

Airflow

AWS

PySpark

Python

Spark

SQL

🕒 December 18, 2025

Trunk

11 - 50

☁️ SaaS

⚡ Productivity

🏢 Enterprise

Tech Lead responsible for CI Reliability Platform at Trunk. Creating data infrastructure to enhance flaky test detection and CI analytics products.

🏢🏡 San Francisco – Hybrid

💵 $200k - $245k / year

⏰ Full Time

🟠 Senior

🧑‍💻 Full-stack Engineer

AWS

Distributed Systems

Kubernetes

Postgres

Python

Rust

TypeScript

🕒 December 18, 2025

AngelList

51 - 200

💸 Finance

💳 Fintech

☁️ SaaS

Senior Product Engineer developing tools for venture capital firms at AngelList. Building foundational products to enhance investor management at the company headquartered in San Francisco.

🏢🏡 San Francisco – Hybrid

💵 $200k / year

💰 $44M Series B in April 2022

⏰ Full Time

🟠 Senior

🧑‍💻 Full-stack Engineer

GraphQL

Node.js

React

TypeScript