
10,000+ employees
Founded 1982
Media
Architecture • Engineering • Media
Autodesk is a global leader in software for designers, engineers, builders, and creators. The company provides a comprehensive suite of design and engineering applications including popular products like AutoCAD, Revit, and 3ds Max. Through its Design and Make Platform, Autodesk empowers professionals across various industries to design, visualize, and manage projects efficiently, facilitating innovation and sustainability in architecture, engineering, construction, and manufacturing.
0 minutes ago
Idaho, Texas – Remote
$117k - $209.3k / year
Full Time
Senior
DevOps & Site Reliability Engineer (SRE)

• Serve as a primary owner for the reliability, availability, performance, operability, and capacity of one or more production services
• Deploy, operate, maintain, and continuously improve production services running in Autodesk GovCloud environments
• Partner with engineering teams to ensure services are designed with reliability, scalability, security, and operability in mind
• Define and operate reliability practices such as SLOs/SLIs, error budgets, production readiness reviews, service reviews, and operational health reviews
• Build automation to improve deployment safety, operational efficiency, incident response, and service recovery
• Design, develop, and maintain software, automation, and tooling that improve the reliability, scalability, and efficiency of production systems
• Implement and improve monitoring, alerting, logging, tracing, and observability capabilities across supported services
• Lead and participate in incident response, troubleshooting, and post-incident reviews focused on learning and continuous improvement
• Develop and maintain operational documentation, runbooks, and recovery procedures
• Scale and enhance resilience testing and Gameday practices to validate system behavior, recovery capabilities, and operational readiness
• Continuously identify and eliminate operational toil through software engineering, automation, and process improvement
• Ensure supported services remain compliant with Autodesk security, privacy, and regulatory requirements, including FedRAMP and related controls where applicable
• Participate in a 24x7 on-call rotation for production services
• B.S. or higher in Computer Science, Engineering, or a related technical discipline, or equivalent practical experience
• 7+ years of experience in Site Reliability Engineering, Software Engineering, Platform Engineering, Cloud Infrastructure, or Production Operations
• Experience operating and supporting customer-facing production services in large-scale cloud environments
• Strong understanding of reliability engineering principles, including SLOs/SLIs, observability, incident management, capacity planning, production readiness, and automation
• Experience with AWS, Azure, or other public cloud platforms
• Experience developing automation using languages such as Python, Go, Java, PowerShell, Bash, or similar
• Experience with Infrastructure as Code, CI/CD pipelines, deployment automation, and modern cloud operations practices
• Understanding of security, compliance, and operational risk management in production environments
• Strong written and verbal communication skills
• Health and financial benefits
• Time away and everyday wellness
36 minutes ago
Senior Database Reliability Engineer overseeing cloud-based SQL Server infrastructure at Coupa. Leading database architecture and ensuring reliable, high-performance data solutions.
4 hours ago
11 - 50
Site Reliability Engineer enhancing AWS-based platform reliability at Pinterest and scaling Kubernetes workloads. Operating and improving cloud-native infrastructure with a focus on automation and resilience.
United States – Remote
$114.3k - $235.3k / year
Full Time
Mid-level
Senior
DevOps & Site Reliability Engineer (SRE)
4 hours ago
DevSecOps Lead managing secure software development lifecycle at YipitData. Collaborating across departments to strengthen security practices within engineering operations.
5 hours ago
DevSecOps Lead building secure software development lifecycle and vulnerability management at YipitData. Leading cross-functional collaboration to implement security standards across software development.
16 hours ago
10,000+ employees
Site Reliability Engineer collaborating with teams to establish SRE practices and participate in system design reviews at Guidehouse. Focused on AWS cloud infrastructure and promoting automation.
United States – Remote
$80k - $133k / year
Grant on 2023-02
Full Time
Mid-level
Senior
DevOps & Site Reliability Engineer (SRE)
H1B Visa Sponsor