Scientific AI Evaluation, Computational Problem Designer

🕒 May 5

Weekday (YC W21)

11 - 50 employees

Founded 2021

☁️ SaaS

🎯 Recruiter

Human Resources • SaaS • Recruitment

Weekday is a recruitment platform that combines AI technologies with a large database of potential candidates to streamline hiring for companies in India. Its services include proactive outreach that helps employers connect with top talent, as well as tools that let candidates apply for jobs easily. Weekday engages candidates through multiple channels, including email, WhatsApp, and phone calls.

📋 Description

• Design advanced computational problems requiring the use of domain-specific scientific software
• Create tasks that test both precise execution (multi-step workflows, simulations) and strategic reasoning (experiment design, inference from partial data)
• Develop problem setups, solution pathways, and validation mechanisms
• Calibrate and refine tasks based on model performance to achieve target difficulty levels
• Ensure problems emphasize reasoning strategy over brute-force computation

🎯 Requirements

• Demonstrated proficiency with at least one relevant scientific library (via research, open-source work, or industry experience)
• Ability to work independently and iterate based on feedback
• Comfort working in Linux/terminal environments and remote compute setups
• Availability of at least 15–20 hours per week
• Graduate-level expertise (MS or PhD preferred) in a relevant STEM field
• Hands-on experience using scientific software libraries for real research problems
• Strong Python programming skills, including building computational workflows and validators
• Ability to design challenging problems that require deep reasoning rather than surface-level solutions
• Familiarity with edge cases, limitations, and practical challenges of scientific tools
• Experience across multiple domains or tools
• Background in evaluation frameworks or benchmarking
• Experience in teaching, pedagogy, or problem-set design
• Familiarity with reproducible research practices and containerized environments
