Lead Member of Technical Staff, Inference Infrastructure

🕒 April 28

🏄 California – Remote

⏰ Full Time

🟠 Senior

🖥 Software Engineer

🦅 H1B Visa Sponsor

Cohere

11 - 50 employees

🤖 Artificial Intelligence

🏢 Enterprise

☁️ SaaS

Cohere is a leading enterprise AI platform optimized for generative AI, search and discovery, and advanced retrieval. The company offers AI-powered applications designed to augment and elevate the global workforce, helping businesses thrive in the AI era. Cohere provides solutions such as embedding and reranking models, allowing enterprises to efficiently retrieve information and build powerful applications. The company offers flexible deployment options for enterprise-grade AI, on any cloud or on-premises, and provides extensive developer resources and support. Cohere is committed to scaling intelligence to serve humanity, making intelligence abundant, affordable, and accessible.

📋 Description

• Join the Model Serving team at Cohere and provide technical leadership across multiple teams
• Drive the architecture and strategy for deploying optimized NLP models to production in low latency, high throughput, and high availability environments
• Serve as a key point of contact for customers, leading the design of customized deployments to meet their specific needs
• Mentor engineers to raise the technical bar across the team

🎯 Requirements

• 8+ years of engineering experience running production infrastructure at a large scale, with a track record of technical leadership
• Demonstrated experience leading the architecture and design of large, highly available distributed systems with Kubernetes and GPU workloads on those clusters
• Deep expertise with Kubernetes dev and production coding and support, including setting team-wide standards and best practices
• Extensive experience across GCP, Azure, AWS, OCI, and multi-cloud on-prem / hybrid serving environments, with the ability to guide strategic infrastructure decisions
• Proven ability to lead the design, deployment, support, and troubleshooting of complex Linux-based computing environments at scale
• Experience owning compute/storage/network resource and cost management at an organisational level, including optimisation strategies
• Exceptional collaboration and communication skills, with experience mentoring engineers and leading cross-functional initiatives to build mission-critical systems
• The grit and adaptability to both solve and guide others through complex technical challenges that evolve day to day
• Strong expertise in the computational characteristics of accelerators (GPUs, TPUs, and/or custom accelerators), and how to leverage them to drive latency and throughput improvements at scale
• Deep knowledge of distributed systems, with experience establishing patterns and practices across engineering teams
• Proficiency in Golang, C++ or other languages designed for high-performance scalable servers, with the ability to set coding standards and conduct senior-level technical reviews

🏖️ Benefits

• An open and inclusive culture and work environment
• Work closely with a team on the cutting edge of AI research
• Weekly lunch stipend, in-office lunches & snacks
• Full health and dental benefits, including a separate budget to take care of your mental health
• 100% Parental Leave top-up for up to 6 months
• Personal enrichment benefits towards arts and culture, fitness and well-being, quality time, and workspace improvement
• Remote-flexible, offices in Toronto, New York, San Francisco, London and Paris, as well as a co-working stipend
• 6 weeks of vacation (30 working days!)
