Software Engineer – Model Evaluation, Benchmarking

🕒 February 25

🏢🏡 San Francisco – Hybrid

⏰ Full Time

🟡 Mid-level

🟠 Senior

🧑‍💻 Full-stack Engineer

SPREEAI

11 - 50 employees

👗 Fashion

🤖 Artificial Intelligence

🛍️ eCommerce

💰 Funding Round - SpreeAI on 2025-05

SPREEAI is an AI-driven fashion technology company that provides real-time virtual try-on experiences for apparel, embedded directly into a brand's shopping platform. Customers can take or upload a photo, or select a model, to see garments styled and fitted instantly, which helps brands increase personalization and conversion. SPREEAI partners with fashion labels, industry organizations, and academic institutions to advance its fit and visualization technology within e-commerce.

📋 Description

• Build automated evaluation pipelines for multimodal AI models.
• Benchmark diffusion models, vision systems, and generative workflows.
• Validate model checkpoints and detect regressions across versions.
• Develop evaluation metrics for realism, consistency, and performance.
• Integrate evaluation tooling into CI/CD workflows.
• Collaborate with ML researchers and infrastructure teams to ensure production readiness.
• Analyze failure modes and propose evaluation strategies.

🎯 Requirements

• Degree in Computer Science, AI, Engineering, or a comparable combination of education and practical experience.
• Strong programming skills in Python.
• Familiarity with object-oriented programming (C++, Java, Python, or similar).
• Strong data structures and algorithms fundamentals.
• Understanding of machine learning experimentation workflows.
• Experience evaluating vision or generative models.
• Familiarity with the HuggingFace ecosystem or open-source ML toolkits.
• Experience building automated test frameworks or benchmarking tools.
• Knowledge of diffusion models or multimodal architectures.
• Experience with data analysis tools (NumPy, Pandas, visualization libraries).

🏖️ Benefits

• Health insurance
• Professional development opportunities
• Flexible work arrangements

Similar Jobs

🕒 February 25

Midi Health

51 - 200

Senior Fullstack Engineer developing patient experience products at Midi Health. Collaborating with cross-functional teams to scale healthcare solutions and influence engineering culture.

🏢🏡 San Francisco – Hybrid

💵 $170k - $220k / year

⏰ Full Time

🟠 Senior

🧑‍💻 Full-stack Engineer

Django

JavaScript

Next.js

React

🕒 February 24

Baseten

11 - 50

🤖 Artificial Intelligence

☁️ SaaS

🏢 Enterprise

Software Engineer for GPU Networking at Baseten, building the infrastructure for AI inference optimization. Enhancing performance by integrating advanced networking protocols and optimizing distributed systems.

🏢🏡 San Francisco – Hybrid

💵 $150k - $250k / year

💰 $8M Seed Round on 2022-04

⏰ Full Time

🟡 Mid-level

🟠 Senior

🧑‍💻 Full-stack Engineer

🦅 H1B Visa Sponsor

Python

🕒 February 23

OpenAI

201 - 500

🤖 Artificial Intelligence

☁️ SaaS

🏢 Enterprise

Software Engineer specializing in infrastructure systems for ChatGPT. Develop core abstractions and tooling to support engineering teams in fast iterations.

Distributed Systems

🕒 February 19

Vercel

201 - 500

☁️ SaaS

🌐 Web 3

Software Engineer responsible for developing the Vercel Dashboard for user interaction and experience optimization. Working across the stack to build personalized, agent-powered surfaces for users.

🏢🏡 San Francisco – Hybrid

💵 $196k - $294k / year

💰 $150M Series D on 2021-11

⏰ Full Time

🟡 Mid-level

🟠 Senior

🧑‍💻 Full-stack Engineer

🦅 H1B Visa Sponsor

JavaScript

Next.js

React

TypeScript

🕒 February 19

Aquabyte

11 - 50

🌾 Agriculture

🤖 Artificial Intelligence

Senior Backend Engineer developing systems for real-time video streaming, AI analysis, and industrial machinery control. Collaborating on cloud and edge systems focusing on reliability and security.

🏢🏡 San Francisco – Hybrid

💵 $140k - $170k / year

⏰ Full Time

🟠 Senior

🧑‍💻 Full-stack Engineer

AWS

Cloud

Distributed Systems

Docker

FFmpeg

IoT

Python

Go