Director, AI Alignment and Interpretability

🔥 0 minutes ago

🇺🇸 United States – Remote

💵 $195k - $290k / year

⏰ Full Time

🔴 Lead

🤖 Artificial Intelligence

🦅 H1B Visa Sponsor

CrowdStrike

5001 - 10000 employees

Founded 2011

🔒 Cybersecurity

☁️ SaaS

🤖 Artificial Intelligence

CrowdStrike is a cybersecurity company that provides cloud-based security services to stop breaches. It is recognized as a leader in endpoint protection, identity and cloud security, and managed detection and response. CrowdStrike's platform, Falcon, integrates artificial intelligence to offer real-time visibility, detection, and protection against sophisticated cyber threats. The company is lauded for its effectiveness in securing networks and data, making it a trusted partner for businesses worldwide.

📋 Description

• Own the alignment and interpretability research agenda for security-domain AI
• Set priorities, personally lead the hardest open problems, and develop methods that explain model behavior mechanistically: not just what models do, but why, and what that implies at the edges of their training distribution
• Build and apply techniques for detecting offensive-misuse signal in model internals, including probing for latent representations of vulnerability knowledge, circuit analysis to understand how security-relevant capabilities are encoded, and activation analysis to surface risk that behavioral testing alone would miss
• Work closely with the adversarial evaluation team to close the loop between what they find in testing and what you find in the weights
• Develop alignment methodology for security-domain AI and own the evaluation framework that makes it measurable
• Contribute original research through publications and external engagement
• Recruit, develop, and retain a lean team of research scientists

🎯 Requirements

• MS or PhD in machine learning, computer science, or a related field, with research depth in interpretability, AI alignment, or a closely adjacent area
• 8+ years in ML research or engineering, with direct experience doing interpretability or alignment research on large language models
• Hands-on expertise with mechanistic interpretability methods (probing classifiers, circuit analysis, activation patching, causal tracing, feature visualization) applied to real models
• Experience designing and running alignment evaluations: behavioral testing, capability elicitation, red-lining, or similar methodologies rigorous enough to support meaningful safety claims
• Track record of leading and growing researchers while remaining an active technical contributor yourself

🏖️ Benefits

• Market leader in compensation and equity awards
• Comprehensive physical and mental wellness programs
• Competitive vacation and holidays for recharge
• Paid parental and adoption leaves
• Professional development opportunities for all employees regardless of level or role
• Employee Networks, geographic neighborhood groups, and volunteer opportunities to build connections
• Vibrant office culture with world class amenities
• Great Place to Work Certified™ across the globe
