GPU Compute, MLIR Engineer

Weekday (YC W21)

11 - 50 employees

Founded 2021

☁️ SaaS

🎯 Recruiter

Human Resources • SaaS • Recruitment

Weekday is a modern recruitment platform that combines AI technologies with a vast database of potential candidates, aiming to streamline the hiring process for companies in India. It offers several services, including proactive outreach that helps employers connect with top talent and tools that let candidates apply for jobs easily. Weekday's emphasis on candidate engagement through multiple channels, including email, WhatsApp, and phone calls, sets it apart in the competitive landscape of recruitment agencies.

📋 Description

• Develop and optimize GPU compute kernels targeting OpenCL and Vulkan compute backends for high-throughput AI/ML workloads.
• Design, build, and extend MLIR dialects across multiple abstraction levels, including frontend dialects, graph-level IR, tensor IR (e.g., Linalg, Tensor, TOSA), and runtime/low-level dialects, to enable efficient end-to-end model compilation.
• Implement and maintain MLIR-based compiler passes and transformations, including tiling, fusion, bufferization, vectorization, and lowering pipelines targeting OpenCL and Vulkan GPU backends.
• Conduct profiling and bottleneck analysis of compiled kernels using GPU counters and vendor-specific profilers, and drive performance improvements through compiler-level optimizations.
• Build and maintain GPU runtime infrastructure for both OpenCL and Vulkan, including memory management, pipeline setup, command buffer orchestration, and resource scheduling.
• Develop and extend code generation pipelines, enabling automatic lowering from tensor IR through MLIR to efficient OpenCL and Vulkan GPU kernels.
• Implement performance-critical schedules (including tiling, loop fusion, parallelism, and caching strategies) within MLIR-based backends targeting OpenCL and Vulkan runtimes.
• Collaborate with framework teams to optimize end-to-end model lowering for computer vision and LLM workloads using MLIR compilation stacks.
• Design and implement robust compiler and runtime components using modern C/C++, leveraging advanced programming paradigms for high-performance systems.

🎯 Requirements

• Strong hands-on experience with the MLIR framework, including authoring and extending custom dialects, writing compiler passes, and building end-to-end lowering pipelines.
• Deep expertise across MLIR abstraction levels:
  - Frontend dialects: ingestion and representation of ML models (e.g., TOSA, StableHLO, ONNX-MLIR)
  - Graph-level IR: high-level operation fusion, shape inference, and graph transformations
  - Tensor IR level: structured operation representation using Linalg, Tensor, and Vector dialects; tiling and fusion strategies
  - Runtime/low-level dialects: Bufferization, MemRef, SCF, GPU, and LLVM dialects for final code generation
• Strong hands-on experience in OpenCL programming, including kernel development, memory model, work-group/work-item optimization, and OpenCL runtime management.
• Solid understanding of Vulkan compute programming, including descriptor management, compute pipelines, synchronization primitives, and Vulkan runtime internals.
• Strong understanding of GPU architecture, memory hierarchies, and asynchronous compute.
• Proficiency in C/C++ for system-level development.
• Experience with kernel profiling and bottleneck analysis on GPU platforms.
• Strong background in machine learning fundamentals, covering both Computer Vision (CV) and Large Language Model (LLM) workloads.
