Principal Performance Engineer – Lead

🕒 April 7

🍂 Massachusetts – Remote

💵 $169.3k - $304.7k / year

⏰ Full Time

🟠 Senior

👷🏻‍♀️ Engineer

🦅 H1B Visa Sponsor

Akamai Technologies

5001 - 10000 employees

🔒 Cybersecurity

💰 Post-IPO Equity on 2001-07

Cloud Computing • Cybersecurity • Content Delivery

Akamai Technologies is a leading cloud services provider that specializes in delivering security, cloud computing, and content delivery solutions. It offers a range of services such as API security, DDoS protection, and performance optimization for web applications, ensuring secure and reliable user experiences. With a robust global infrastructure, Akamai empowers businesses to streamline their digital presence while safeguarding against various cyber threats and enhancing application performance.

📋 Description

• Optimize inference performance across the Akamai Inference Cloud
• Collaborate closely with hardware performance engineers to deliver end-to-end optimization
• Apply and evaluate quantization, distillation, and pruning techniques to optimize model performance while preserving accuracy
• Design hardware-aware model placement and scheduling strategies to match models with optimal compute resources
• Implement and tune speculative decoding, KV-cache optimization, and batching strategies to improve inference throughput and latency
• Build benchmarking and profiling pipelines to measure model-layer performance across architectures, hardware, and serving configurations
• Mentor and guide engineers on the team through code reviews, design discussions, and technical problem-solving
• Collaborate with hardware performance engineers to identify and resolve end-to-end performance bottlenecks across the inference stack

🎯 Requirements

• 12+ years of relevant experience with a Bachelor's or Master's degree in Computer Science, Machine Learning, or a related field
• Possess hands-on experience optimizing LLM inference performance (quantization, speculative decoding, model compression, etc.)
• Have a solid understanding of transformer architectures and how design choices impact latency, throughput, and accuracy
• Possess experience with inference serving frameworks such as vLLM, TensorRT-LLM, Triton, or similar systems
• Be proficient in Python and C++ with experience profiling and optimizing compute-intensive workloads
• Have familiarity with hardware-aware optimization, including GPU/accelerator scheduling and memory management trade-offs

🏖️ Benefits

• healthcare
• 401K savings plan
• company holidays
• vacation (in the form of PTO)
• sick time
• family friendly benefits including parental leave
• employee assistance program including a focus on mental and financial wellness
