Senior Speech Applied Scientist

October 2

Omilia - Conversational Intelligence

Artificial Intelligence • eCommerce • Customer Service

Omilia is a leader in Conversational AI, specializing in voice and chat solutions that enable natural, end-to-end customer interactions. Their Omilia Cloud Platform provides advanced AI-driven customer service tools, including real-time agent assistance, voice biometrics for fraud prevention, and data analytics to enhance customer insights. Serving industries such as finance, insurance, retail, automotive, and travel, Omilia focuses on automating customer service while ensuring a secure and personalized experience.

201 - 500 employees

Founded 2002

📋 Description

• Pioneer Research: Research and implement state-of-the-art approaches for multi-modal LLMs within an end-to-end, speech-to-speech dialog system architecture.
• Train & Optimize: Drive the training, fine-tuning, and optimization of our multi-modal LLMs. Your focus will be on enabling full-duplex conversational capabilities, advanced tool-calling, robust barge-in detection, stronger reasoning, in-context learning, and context-aware natural speech generation.
• Build Data Flywheels: Design and implement robust data pipelines for the entire multi-modal LLM lifecycle, including data curation, preparation, model training, and rigorous evaluation.
• Scale Our Infrastructure: Develop and optimize our training infrastructure to enable fast, large-scale experimentation (multi-GPU and multi-node training), dramatically accelerating our S2S model development cycle.
• Collaborate & Deploy: Work closely with product and engineering teams to transform your research models into robust, scalable, and deployable services that our customers will love.
• Publish Your Work: Publish pioneering research at top-tier academic conferences while successfully deploying systems into production environments.

🎯 Requirements

• A PhD or MSc in Computer Science, Electrical Engineering, Computational Linguistics, or a related field with a focus on speech processing or deep learning.
• Proven experience in one or more of the following areas: Automatic Speech Recognition (ASR), Text-to-Speech (TTS), Natural Language Processing (NLP), and Spoken Language Understanding (SLU).
• Deep hands-on experience with deep learning frameworks like PyTorch, TensorFlow, DeepSpeed or Lightning.
• Strong background in training, fine-tuning, and evaluating Large Language Models (LLMs), especially in multi-modal or speech-related contexts.
• Experience with large-scale model training on distributed, multi-GPU/multi-node infrastructure.
• A strong publication record in top-tier conferences (e.g., ICASSP, Interspeech, NeurIPS, ACL) is a plus.
• A proactive, collaborative, and innovative mindset with a passion for solving challenging problems.

🏖️ Benefits

• Fixed compensation;
• Long-term employment with vacation counted in working days;
• Professional development (courses, training, etc.);
• Being part of successful cutting-edge technology products that are making a global impact in the service industry;
• Proficient and fun-to-work-with colleagues;
• Apple gear.
