Staff Systems Engineer – Cloud Operations, Support

🕒 April 29

Cadence Design Systems

10,000+ employees

Cadence is a pivotal leader in electronics and system design, building upon more than 30 years of computational software expertise. The company applies its underlying Intelligent System Design strategy to deliver software, hardware and IP that turn design concepts into reality. Cadence customers are the world’s most innovative companies, delivering extraordinary electronic products from chips to boards to complete systems for the most dynamic market applications, including hyperscale computing, 5G communications, automotive, mobile, aerospace, consumer, industrial and healthcare. For eight years in a row, Fortune magazine has named Cadence one of the 100 Best Companies to Work For.

📋 Description

• Supporting multiple geographical locations to serve user communities across sites in North America, Europe, and Asia.
• Focusing on improving customer productivity and committing to customer success.
• Driving the overall operational strategy for internal High-Performance Compute (HPC) clusters in the Cadence cloud.
• Maintaining, enhancing, monitoring, and reporting on the clusters, and improving their efficiency.

🎯 Requirements

• 8+ years of technical experience architecting, managing, and improving an HPC environment running Linux.
• At least 3 years working in a global group, coordinating support, strategies, projects, and operations across multiple geographies in a team-oriented approach.
• Solid understanding and proven operational experience with HPC clusters, job submission/management technologies, cloud, and associated management tools.
• Proven experience working directly with engineering teams to collaboratively develop solutions to optimize their working environment (direct EDA experience desired).
• Proven experience in capacity and performance management: optimizing performance, ensuring adequate capacity, working with customers to optimize their workloads, and developing and maintaining key performance indicators.
• A proven process focus shown through documentation, change management, incident management, and problem-resolution activities.
• Extensive hands-on experience with Docker: image management, container orchestration, and troubleshooting.
• Deep expertise in Linux system administration (RHEL preferred), including networking, storage, and performance tuning.
• Familiarity with user authentication and integration using systems like LDAP or Active Directory.
• Hands-on GPU Cluster Management: Experience in configuration, installation, and optimization of GPU server clusters.
• Hands-on technical experience managing GPU VMs, and installing and configuring instances and other services on OpenStack.
• Automation & Monitoring: Develop and maintain automation scripts using languages like Python, Bash, or Perl to streamline system maintenance, deployment, and reporting.
• Strong problem-solving and communication skills with the ability to work in a multi-platform, cross-functional, and geographically distributed team.

🏖️ Benefits

• Health insurance
• 401(k) matching
• Flexible work hours
• Paid time off
• Professional development
