
1001 - 5000 employees
Founded 2019
🤖 Artificial Intelligence
🚗 Transport
💰 Grant on 2020-12
Artificial Intelligence • Transport • Automotive
Cerence Inc. is a global company that provides AI-powered solutions, primarily for the automotive industry. It specializes in conversational and generative AI technologies that create intelligent, natural, and personalized interactions between people and vehicles. With innovations such as its proprietary automotive large language models, Cerence enhances user experiences across cars, two-wheelers, and trucks. More than 500 million vehicles have shipped with its AI technology, and the company serves more than 80 OEMs and Tier 1 customers worldwide. Cerence is dedicated to continuous advancement in AI, aiming to transform in-car user experiences through fast delivery and seamless integration of its solutions.
🔥 14 minutes ago

• Optimize and deploy high-performance LLM inference pipelines
• Own inference runtimes across data center, edge, and embedded platforms
• Push model performance through quantization, kernel fusion, and cache optimization
• Drive latency and throughput improvements that directly impact production products
• Enable efficient, reliable deployment without external vendor dependency
• Build deep expertise and ownership of: vLLM, TensorRT-LLM, llama.cpp, QAIRT
• Extend and tune inference engines using custom CUDA kernels
• Adapt runtimes for constrained and embedded deployment environments
• Implement and evaluate quantization strategies: INT8, INT4, FP4, FP8, mixed precision, AWQ, GPTQ
• Balance accuracy, latency, memory footprint, and throughput
• Optimize key-value cache performance through: paging, prefix caching, cache-aware memory layout design
• Design and tune: batching strategies, continuous batching, speculative decoding
• Proven experience optimizing ML inference performance in production
• Deep understanding of GPU architecture and memory hierarchies
• Hands-on experience with CUDA and low-level performance tuning
• Experience deploying models beyond research environments

Critical Technical Skills
• Inference engines: vLLM, TensorRT-LLM, llama.cpp, QAIRT
• CUDA kernel development and profiling
• Quantization techniques: INT8/INT4/FP4/FP8, AWQ, GPTQ
• KV cache optimisation and memory layout design
• Latency optimisation: batching, speculative decoding, continuous batching
• Annual bonus opportunity
• Insurance coverage (medical, dental, vision, life, and disability)
• Paid time off
• Paid holidays
• Company contribution to the RRSP (Registered Retirement Savings Plan)
• Equity awards for certain positions and levels
• Remote and/or hybrid work available depending on the position
🔥 34 minutes ago
Senior Software Engineer developing scalable platform components and supporting cloud infrastructure at Robert Half. Leading design and implementation with a focus on CI/CD and platform reliability.
🇺🇸 United States – Remote
💵 $104k - $153k / year
⏰ Full Time
🟡 Mid-level
🟠 Senior
🧑💻 Full-stack Engineer
🦅 H1B Visa Sponsor
🔥 45 minutes ago
Senior Software Development Engineer developing backend applications to improve healthcare engagements and reduce physician burnout using innovative technologies.
🔥 47 minutes ago
Senior Engineer building next generation AI powered software at Cushman & Wakefield. Leading full stack teams and shaping engineering strategy with modern tools and platforms.
🔥 59 minutes ago
Senior Software Engineer at EasyPost designing and developing software solutions for shipping operations. Collaborating with cross-functional teams to create scalable software products.
🇺🇸 United States – Remote
💵 $180k - $205k / year
💰 $25M Series B on 2021-09
⏰ Full Time
🟠 Senior
🧑💻 Full-stack Engineer
🦅 H1B Visa Sponsor
🔥 1 hour ago
Tech Lead for Consumer Team to drive technical direction and execution for consumer web experience at Koalafi. Leading a team of engineers in modernizing systems and delivering tools for financial needs.
🇺🇸 United States – Remote
💵 $167.7k - $217.1k / year
💰 Debt Financing on 2022-08
⏰ Full Time
🟠 Senior
🧑💻 Full-stack Engineer
🦅 H1B Visa Sponsor