Research Lead – Principal Scientist, Manager Post-Training, Alignment, Reinforcement Learning

🕒 May 13

🇨🇦 Canada – Remote

⏰ Full Time

🟠 Senior

🧬 Research Scientist

Autodesk

10,000+ employees

Founded 1982

📱 Media

Architecture • Engineering • Media

Autodesk is a global leader in software for designers, engineers, builders, and creators. The company provides a comprehensive suite of design and engineering applications including popular products like AutoCAD, Revit, and 3ds Max. Through its Design and Make Platform, Autodesk empowers professionals across various industries to design, visualize, and manage projects efficiently, facilitating innovation and sustainability in architecture, engineering, construction, and manufacturing.

📋 Description

• Own post-training strategy for model development, from RLHF and preference optimization to agentic systems and long-horizon reasoning
• Develop novel algorithms that improve model reliability, controllability, and alignment
• Make principled architectural decisions about when to address challenges at the pre-training, post-training, or system level
• Design and run experiments that shape model behavior, robustness, and reasoning quality
• Partner with infrastructure teams to build scalable, reproducible post-training workflows
• Contribute to publications, patents, and Autodesk's external research visibility
• Design evaluation frameworks for long-horizon reasoning, tool use, agentic behavior, safety, and real-world workflow completion
• Lead rigorous model analysis and interpretability efforts
• Drive human-in-the-loop evaluation with high annotation quality and sound scientific methodology
• Establish model readiness criteria and provide go/no-go recommendations for releases
• Manage, mentor, and grow a team of AI scientists
• Set technical direction and research priorities across post-training and alignment initiatives
• Foster a research culture grounded in scientific rigor, reproducibility, and fast iteration
• Help recruit world-class talent across ML, RL, alignment, and foundation models
• Partner closely with pre-training teams, infrastructure, product organizations, and other stakeholders
• Translate research trade-offs into clear, decision-ready guidance for leadership

🎯 Requirements

• Deep hands-on expertise in reinforcement learning for foundation models, and fluency with post-training methods (RLHF, RLAIF, DPO, PPO, or adjacent approaches)
• Proven experience leading or mentoring technical research teams, whether in an academic lab, AI research organization, or industry setting
• Strong intuition for model behavior, alignment challenges, and post-training trade-offs
• Experience designing evaluation systems and thinking rigorously about what it means for a model to be ready
• Ability to communicate complex technical trade-offs clearly to both technical and non-technical audiences
• A PhD or equivalent depth of industry research experience in ML, RL, AI, or a related field

🏖️ Benefits

• Health insurance
• Retirement plans
• Paid time off
• Flexible work arrangements
• Professional development
• Bonuses
• Stock options
• Equipment allowances
• Wellness programs

Similar Jobs

🕒 May 8

Precision Medicine Group

1001 - 5000

🧬 Biotechnology

⚕️ Healthcare Insurance

💊 Pharmaceuticals

Senior Research Scientist specializing in evidence synthesis for health technology assessments, collaborating remotely across sophisticated projects.

🇨🇦 Canada – Remote

💰 $35.2M Venture Round on 2021-03

⏰ Full Time

🟠 Senior

🧬 Research Scientist

🕒 April 30

MaintainX

51 - 200

Senior Applied Scientist at MaintainX enhancing scheduling systems through Python optimization and collaboration with product design teams. Focused on real-world applications of AI-driven scheduling.

🇨🇦 Canada – Remote

⏰ Full Time

🟠 Senior

🧬 Research Scientist

🕒 January 19

Prolific

51 - 200

🤝 B2B

AI Trainer – Research Scientist at Prolific, evaluating performance of advanced AI models. Requires scientific research experience with flexible hours and remote work.

🇨🇦 Canada – Remote

💵 £50 / hour

⏰ Full Time

🟡 Mid-level

🟠 Senior

🧬 Research Scientist

🕒 January 15

FirstPrinciples Holding Company

51 - 200

🤝 B2B

🏢 Enterprise

💸 Finance

Research Fellow at FirstPrinciples developing cutting-edge AI Physicist for theoretical research. Engaging deeply with advanced AI and physics through innovative methods and collaboration.

🇨🇦 Canada – Remote

⏰ Full Time

🟡 Mid-level

🟠 Senior

🧬 Research Scientist

🕒 September 29, 2025

Chelsea Avondale

1 - 10

🔬 Science

🤖 Artificial Intelligence

Research Scientist developing wildfire, flood, and windstorm models for Chelsea Avondale's home insurance risk platform.

🇨🇦 Canada – Remote

⏰ Full Time

🟡 Mid-level

🟠 Senior

🧬 Research Scientist