Researcher, Evaluations

🇺🇸 United States – Remote

💵 $115k - $200k / year

⏰ Full Time

🟡 Mid-level

🟠 Senior

Epoch AI

11 - 50 employees

Founded 2022

🤖 Artificial Intelligence

🔬 Science

🤝 B2B

Epoch AI is a research and data organization that tracks and analyzes trends in artificial intelligence, maintaining open databases of AI models, data centers, hardware performance, chip sales, and benchmarking results. It publishes papers, reports, newsletters, podcasts, and data insights, and offers custom research and advisory services to policymakers, institutions, and companies to inform decisions about AI development, infrastructure, and governance.

📋 Description

• Create and curate an evaluation suite. Find real-world tasks that serve as challenging tests for practical AI capabilities, and update the tasks over time as AI capabilities evolve. Devise rubrics for evaluating AI performance.
• Evaluate AI systems. Regularly evaluate new, notable AI models and products on the task suite. Update tasks and rubrics to reflect the changing landscape of AI capabilities.
• Communicate your research. Create public-facing reports, blog posts, and data visualizations with your observations. Ensure the evaluations feed into our other research topics and help keep our team informed.
• Conduct data analysis. Analyze evaluation results and compare models across tasks.
• Improve the process. You might automate parts of the workflow, and build out parts of the evaluation into standalone benchmarks.

🎯 Requirements

• **Analytical thinking.** You conduct experiments with rigor and care, making sure that findings are well-supported by evidence.
• **Grounded, skeptical mentality.** You form your own well-reasoned view of what an AI system can do, distinguishing practical capabilities from hype.
• **Comfort with AI agents and tools.** You have experience working with AI agents in the course of your own work, and are comfortable delegating tasks.
• **Familiarity with AI benchmarks and evaluations.** You follow AI capabilities at least casually and have opinions on what benchmarks do and don’t tell us.
• **Research and data-analysis experience,** including enough comfort with light coding to analyze your own results.
• **Strong written communication skills:** You can convey nuanced observations clearly and precisely.

🏖️ Benefits

• Annual salary between **$115,000 – $200,000 USD**, depending on location and experience.
• Fully remote environment, including flexible work hours.
• Competitive global benefits program, including comprehensive health insurance (with supplemental benefits specific to your country, as available and mandated by local law), life insurance, and a pension plan, if applicable in your country.
• Generous paid time off (PTO) with no specific annual limit and 30 days of PTO per year protected, unlimited personal and sick leave, and 4 months of paid parental leave for permanent staff with at least 12 months of tenure (prorated parental leave if less than 12 months).
• A flexible and generous expense policy covering equipment and a wide range of productivity tools or learning/development opportunities, including unlimited spending on AI tools, subject to regulations and manager approval.
• Paid work trips, including 3 staff retreats per year and relevant conferences.
• Access to our very well-equipped offices in Berkeley, California, including paid meals, snacks, gym, and more. All staff, regardless of where they are based, have access to the office for at least 20 days each year.
