
Artificial Intelligence • SaaS • Hardware
Lambda provides cloud-based solutions and hardware for AI development. It offers on-demand GPU clusters for multi-node training and fine-tuning, as well as inference endpoints and APIs. Its products include the Lambda GPU Cloud, which features NVIDIA's latest generation of infrastructure for enterprise AI, and customizable GPU workstations and desktops designed for AI and deep learning. Lambda also offers a one-line installation and managed upgrade path for machine learning tools like PyTorch, TensorFlow, and NVIDIA CUDA. Focused on enabling AI developers, Lambda provides both public and private cloud services with access to NVIDIA Tensor Core GPUs.
November 5
🇺🇸 United States – Remote
💵 $128k - $149k / year
⏰ Full Time
🟠 Senior
🗣️ LLM Engineer
🦅 H1B Visa Sponsor

• Lead end-to-end deployment of GPU clusters, storage systems, and networking fabric across Lambda’s data centers.
• Design and implement data center network topologies optimized for AI and HPC workloads, including high-speed Ethernet and InfiniBand environments.
• Oversee rack implementation, cabling, and power/cooling validation for optimal efficiency and scalability.
• Collaborate with supply chain, logistics, and operations teams to ensure smooth delivery and installation timelines.
• Implement Layer 2/Layer 3 networks, including VLANs, spine-leaf architecture, and InfiniBand interconnects.
• Partner with network architects to ensure redundancy, scalability, and low-latency interconnects for distributed AI workloads.
• Monitor network health, identify bottlenecks, and implement optimizations to maintain peak performance.
• Oversee server hardware troubleshooting, including GPUs, NICs, CPUs, and storage components.
• Lead root-cause analysis for system issues and drive corrective actions in collaboration with vendors and internal hardware teams.
• Develop standard operating procedures (SOPs) for hardware validation, deployment, and maintenance.
• Serve as technical project lead for infrastructure rollouts and cluster expansion projects.
• Coordinate cross-functional teams (networking, facilities, cloud operations, and hardware engineering) to execute deployments on schedule.
• Manage project scope, budgets, risk assessments, and post-deployment reviews.
• Communicate status, challenges, and milestones to leadership with clarity and precision.
• Maintain detailed network topology diagrams, deployment runbooks, and hardware inventories.
• Identify opportunities for process automation and infrastructure standardization across deployments.
• Contribute to Lambda’s internal knowledge base and mentor junior engineers on data center best practices.
• Bachelor’s degree in Computer Engineering, Information Technology, or a related field.
• CCNA (Cisco Certified Network Associate) certification (CCNP or equivalent a plus).
• PMP (Project Management Professional) certification (PMP or equivalent a plus).
• 5+ years of experience in data center infrastructure deployment or network operations, preferably in AI, HPC, or cloud environments.
• Proven ability to lead complex technical projects and manage multidisciplinary teams.
• Strong understanding of data center network design (Layer 2/3, VLANs, rack elevations, port mapping, InfiniBand technologies).
• Hands-on expertise in server hardware troubleshooting and rack-level integration.
• Health, dental, and vision coverage for you and your dependents
• Wellness and commuter stipends for select roles
• 401(k) plan with 2% company match (USA employees)
• Flexible paid time off plan that we all actually use
October 21
501 - 1000
Senior Generative AI Engineer at Liftoff architecting AI-powered solutions for advertising technology. Pioneering intelligent agents to transform workflows across various core functions.
🇺🇸 United States – Remote
💵 $135k - $227k / year
⏰ Full Time
🟠 Senior
🗣️ LLM Engineer
🦅 H1B Visa Sponsor
Python
October 21
Senior LLM Operations Engineer at N-Power Medicine. Responsible for scaling AI innovation in clinical variable abstraction and note generation through infrastructure and system automation.
AWS
Azure
Cloud
Docker
Google Cloud Platform
Jenkins
Kubernetes
Python
October 9
AI/ML Engineer working on backend services and data analytics for Trellis, a legal data company. Designing data architecture and features for high-speed, large data environments.
Python
October 8
Senior Software Engineer developing Generative AI solutions for Veltris. Leading software development life cycle and driving innovation across products.
AWS
Azure
Cloud
Distributed Systems
Google Cloud Platform
SDLC
September 20
11 - 50
Lead AI infrastructure, data pipelines, governance, and platform APIs at SingleFile compliance SaaS. Drive roadmap, architect decisions, and cross-functional delivery.
Cloud