AI Research Engineer – Multi-Modal, Vision

🔥 0 minutes ago

Tether.to

11 - 50 employees

Founded 2014

₿ Crypto

💳 Fintech

💸 Finance

Tether.to is a leading digital asset company that pioneers the use of stablecoins in the blockchain space. Tether tokens, the most widely adopted stablecoins, are designed to be pegged 1-to-1 with fiat currencies, giving users a stable digital asset option. The platform supports token transactions across multiple blockchains, which simplifies cross-border transfers, and it maintains transparency by publishing daily records of total assets and reserves. Tether also runs educational programs that promote digital asset usage, particularly in regions such as the Middle East, Turkey, and the Philippines. Tether positions itself as a disruptor of the traditional financial system by offering a stable, efficient way to handle digital currency transactions.

📋 Description

• Conduct end-to-end research and engineering on vision-language models, covering training, evaluation, and optimization across the full model development lifecycle.
• Design and implement post-training pipelines including supervised fine-tuning, knowledge distillation, and reinforcement learning from human feedback.
• Develop and maintain high-quality multimodal datasets, including data curation, filtering, and balancing for domain-specific tasks.
• Drive model efficiency and deployability, adapting models for resource-constrained environments using compression and optimization techniques.
• Design and implement evaluation frameworks and benchmarks to measure model performance, robustness, and real-world task success.
• Build and scale training workflows across distributed GPU infrastructure.
• Identify and resolve bottlenecks in training pipelines to achieve state-of-the-art model quality on target benchmarks.
• Contribute to and leverage open-source ecosystems including models, datasets, and tooling to accelerate development.
• Stay current with the latest research in multimodal learning and vision-language systems, translating relevant findings into practical improvements.
• Publish research findings in top-tier AI conferences and journals where applicable.

🎯 Requirements

• Degree in Computer Science, Machine Learning, or a related field; MS/PhD preferred.
• Strong experience with multimodal post-training workflows including supervised fine-tuning, knowledge distillation, and reinforcement learning from feedback.
• Hands-on experience with parameter-efficient fine-tuning and distributed training frameworks.
• Demonstrated ability to build and improve vision-language models with measurable results on standard benchmarks or real-world tasks.
• Experience adapting models for resource-constrained environments.
• Proven open-source contributions in multimodal AI on GitHub or HuggingFace.
• Publications at top AI conferences (NeurIPS, ICML, ICLR, CVPR, ECCV, etc.).

🏖️ Benefits

• Flexible working hours
• Professional development opportunities
