Machine Learning Systems Engineer

October 17

RelationalAI

Artificial Intelligence • SaaS

RelationalAI is a technology company that improves decision-making across organizations through a Knowledge Graph Coprocessor within the Snowflake data cloud. The coprocessor operationalizes the rules, relationships, and decision systems that businesses depend on, combining graph analytics, rules-based reasoning, and mathematical optimization. Using techniques such as AI, graph neural networks, and predictive analytics, RelationalAI helps businesses detect fraud, identify influential customers, and optimize operations. The platform aims to simplify intelligent application development and improve business intelligence through expressive, scalable rule-based reasoning and graph analytics.

51 - 200 employees

Founded 2017

💰 $40M Series A on 2022-04

📋 Description

• Contribute code and performance improvements to the open source project.
• Develop and optimize distributed training algorithms for large language models.
• Implement high-performance inference engines and optimization techniques.
• Work on integration between the vLLM, Megatron-LM, and HuggingFace ecosystems.
• Build tools for seamless model training, fine-tuning, and deployment.
• Optimize performance on advanced GPU architectures.
• Collaborate with the open source community on feature development and bug fixes.
• Research and implement new techniques for self-improving AI agents.

🎯 Requirements

• 3+ years of experience in machine learning engineering or research.
• Proficiency in both C/C++ and Python.
• Deep understanding of HPC concepts, including:
  • MPI (Message Passing Interface) programming and optimization
  • Bulk Synchronous Parallel (BSP) computing models
  • Multi-GPU and multi-node distributed computing
• Solid understanding of gradient descent and backpropagation algorithms.
• Experience with transformer architectures and the ability to explain their mechanics.
• Knowledge of deep learning training and its applications.
• Understanding of distributed training techniques (data parallelism, model parallelism, pipeline parallelism, large batch training, optimization).
• Experience with large-scale distributed training frameworks (Megatron-LM, DeepSpeed, FairScale, etc.).
• Familiarity with inference optimization frameworks (vLLM, TensorRT, etc.).
• Experience with containerization (Docker, Kubernetes) and cluster management.
• Background in systems programming and performance optimization.
• Publications: Experience with machine learning research and publications preferred.
• Ability to read, understand, and implement techniques from recent ML research papers.
• Demonstrated commitment to open source development and community collaboration.

🏖️ Benefits

• We are all owners in the company and reward you with a competitive salary and equity.
• Work from anywhere in the world.
• Comprehensive benefits coverage, including global mental health support.
• Open PTO – Take the time you need, when you need it.
• Company Holidays, Your Regional Holidays, and RAI Holidays, where we take one Monday off each month, followed by a week without recurring meetings, giving you the time and space to recharge.
• Paid parental leave – Supporting new parents as they grow their families.
• We invest in your learning & development.
• Regular team offsites and global events – Building strong connections while working remotely through team offsites and global events that bring everyone together.
• A culture of transparency & knowledge-sharing – Open communication through team standups, fireside chats, and open meetings.
