
1001 - 5000 employees
Upstart is a leading AI lending marketplace partnering with banks and credit unions to expand access to affordable credit. Having transitioned to being a public company, we’re now poised to leverage our domain expertise and revolutionize every aspect of lending and credit risk evaluation. We’ve recently expanded our offerings to include automobile refinancing, and we plan to take on more verticals as the business grows.
🕒 February 12
🇺🇸 United States – Remote
💵 $195.3k - $270.4k / year
⏰ Full Time
🔴 Lead
⛑ DevOps & Site Reliability Engineer (SRE)
🦅 H1B Visa Sponsor

• Lead the definition, advocacy, and adoption of SRE principles across engineering teams
• Partner with leadership to shape long-term reliability, resiliency, and observability strategies
• Champion distributed tracing, real user monitoring (RUM), and key performance metrics such as Largest Contentful Paint (LCP) to improve system visibility and user experience
• Build and scale self-healing systems to minimize manual intervention and reduce downtime
• Drive enterprise-wide improvements to incident response processes, including those related to Machine Learning systems
• Collaborate closely with Development Productivity and Quality teams to improve engineering velocity without sacrificing reliability
• Influence technical and operational roadmaps through data-driven insights and hands-on technical contributions
• Own and deliver cross-functional initiatives from concept through execution, applying program management skills to align stakeholders and achieve results
• 10+ years combined experience across Software Engineering and Site Reliability Engineering, with a balanced background in both disciplines
• Proven track record as an SRE thought leader and evangelist, driving adoption of reliability best practices across organizations
• Strong communication and mentoring skills to influence engineers across disciplines
• Proficiency in Python, Go, and JavaScript/TypeScript
• Proficiency with Infrastructure as Code (Terraform, CDK, CloudFormation, etc.)
• Experience building internal tooling from scratch in agile development environments
• Expertise with observability, distributed tracing, RUM, LCP, and performance monitoring tools (e.g., Datadog, Prometheus)
• Experience with on-call and incident management, including large-scale or ML-related incidents
• Strong background in automation and building self-healing systems
• Hands-on experience with LLM/GenAI to improve SRE efficiency and processes
• Program management skills, including the ability to propose innovative solutions, influence leadership, improve processes, and drive cross-functional projects to completion
• Competitive compensation, including base pay, bonus opportunities, and annual equity grants that vest quarterly
• Generous 401(k) plan with Upstart matching $2 for every $1 contributed, up to $15,000 per year
• Employee Stock Purchase Plan (ESPP) with discounted stock purchase options for eligible employees
• Affordable medical, dental, and vision coverage, with multiple plan options; Upstart covers 90% to 100% of the cost depending on the plans you choose
• Health Savings Account contributions from Upstart for eligible plans
• Income protection benefits, including company-paid Basic Life, AD&D, and Short- and Long-Term Disability coverage, with options to purchase supplemental coverage
• Paid time off, sick and safe time, and company holidays
• Paid family and parental leave to support caregiving and major life moments
• Family-centered benefits through Carrot and Cleo, supporting fertility, parenthood, and caregiving
• Employee Assistance Program (EAP) offering mental health support and life-centered resources
• Financial wellness resources, including access to financial planning tools and a financial concierge service
• Annual wellness allowance to support your physical and emotional well-being and personal development, based on what matters most to you
• Annual productivity allowance to invest in relevant tools and resources you need to do your best work, no matter where you work from
• Connection and community through team events and onsites, all-company updates, and employee resource groups (ERGs)
• Onsite perks, including catered lunches and fully stocked micro-kitchens when working from one of our four offices, located in the Bay Area, Austin, Columbus, and New York City (opening Summer 2026!)
🕒 January 27
Senior DevSecOps Engineer improving cybersecurity posture and supporting compliance for federal requirements in the U.S. Working remotely with less than 10% travel.
Ansible
AWS
Azure
Cloud
Docker
Google Cloud Platform
Kubernetes
OpenShift
Python
Terraform
🕒 January 9
Staff Site Reliability Engineer designing and operating a hybrid cloud environment at PathAI. Focused on implementing SRE best practices and enhancing infrastructure reliability.
🇺🇸 United States – Remote
💵 $165.8k - $224.4k / year
💰 $165M Series C on 2021-05
⏰ Full Time
🔴 Lead
⛑ DevOps & Site Reliability Engineer (SRE)
🦅 H1B Visa Sponsor
Ansible
AWS
Cloud
Grafana
Prometheus
Python
Terraform
🕒 December 24, 2025
SRE / DevOps Manager at Upshop leading reliability and operations engineering team. Responsible for scalability, security, and performance of infrastructure.
AWS
Azure
Cloud
Docker
Google Cloud Platform
Grafana
Kubernetes
MongoDB
Prometheus
Python
Shell Scripting
Terraform
Go
🕒 November 13, 2025
201 - 500 employees
Staff SRE at FloSports improving developer enablement and migrating infrastructure to AWS. Leading technical architecture and critical tooling development with a focus on reliability and automation.
AWS
Google Cloud Platform
JavaScript
Kubernetes
Node.js
Terraform
Go
🕒 November 5, 2025
AWS DevOps Engineer designing cloud-native applications for SAP S/4HANA processes. Optimizing AWS cost/performance in fully remote work environment.
AWS
Cloud
DynamoDB
Kafka