Site Reliability Engineer

10,000+ employees

Founded 1903

🚗 Transport

💰 Post-IPO Debt on 2023-08

Transport • Manufacturing • Sustainability

Ford Motor Company is a globally renowned automotive company based in the United States, established by Henry Ford. The company is committed to building a better world where every individual has the freedom to move and follow their dreams. Ford is dedicated to innovation, with a focus on services, experiences, and software alongside its traditional vehicle manufacturing. The company is actively involved in sustainability initiatives and aims to meet ambitious environmental targets. Ford values service and community impact, and strives to combine business success with social and environmental responsibility. With a history of over 121 years, Ford continues to adapt and lead in the evolving automotive landscape.

🕒 May 9

🏄 California – Remote

💵 $85.4k - $192.9k / year

⏰ Full Time

🟡 Mid-level

🟠 Senior

⛑ DevOps & Site Reliability Engineer (SRE)

🦅 H1B Visa Sponsor

Cloud

Google Cloud Platform

JavaScript

Kubernetes

Postgres

Terraform

Ford Motor Company

📋 Description

• Write, configure, and deploy code in Go and JavaScript that improves service reliability for existing or new systems; set the standard for others with respect to code quality.
• Work within Google Cloud Platform (GCP) infrastructure, optimizing performance and cost, and scaling resources to meet demand.
• Provide helpful and actionable feedback and review for code or production changes.
• Drive repair and optimization of complex systems, taking a wide range of contributing factors into account.
• Lead debugging, troubleshooting, and analysis of service architecture and design.
• Participate in on-call rotation.
• Write documentation: design, system analysis, runbooks, playbooks. Provide design feedback and uplevel the design skills of others.
• Implement and manage SRE monitoring application backends using Golang, Postgres, and OpenTelemetry. Develop tooling using Terraform and other IaC tools to ensure visibility and proactive issue detection across our platforms.
• Collaborate with development teams to enhance system reliability and performance, applying a platform engineering mindset to system administration tasks.
• Develop and maintain automated solutions for operational aspects such as on-call monitoring, performance tuning, and disaster recovery.
• Troubleshoot and resolve issues in our dev, test, and production environments.
• Participate in postmortem analysis and create preventative measures for future incidents.
• Implement and maintain security best practices across our infrastructure, ensuring compliance with industry standards and internal policies. Participate in security audits and vulnerability assessments.
• Participate in capacity planning and forecasting efforts to ensure our systems can handle future growth and demand. Analyze trends and make recommendations for resource allocation.
• Identify and address performance bottlenecks through code profiling, system analysis, and configuration tuning. Implement and monitor performance metrics to proactively identify and resolve issues.
• Develop, maintain, and test disaster recovery plans and procedures to ensure business continuity in the event of a major outage or disaster. Participate in regular disaster recovery exercises.
• Contribute to internal knowledge bases and documentation.

🎯 Requirements

• Bachelor’s degree in Computer Science, Engineering, Mathematics or equivalent work experience.
• 3+ years of experience as an SRE, Software Engineer, DevOps Engineer or similar role.
• Solid programming skills in Golang and scripting languages, with a good understanding of software development best practices.
• Proficient with monitoring and observability tools, particularly OpenTelemetry, Dynatrace or other tools.
• Proficient with cloud services, with a strong preference for Kubernetes and Google Cloud Platform (GCP) experience.
• Experience with relational and document databases.
• Ability to debug, optimize code, and automate routine tasks.
• Strong problem-solving skills and the ability to work under pressure in a fast-paced environment.
• Excellent verbal and written communication skills.

🏖️ Benefits

• Immediate medical, dental, vision and prescription drug coverage
• Flexible family care days, paid parental leave, new parent ramp-up programs, subsidized back-up child care and more
• Family building benefits including adoption and surrogacy expense reimbursement, fertility treatments, and more
• Vehicle discount program for employees and family members and management leases
• Tuition assistance
• Established and active employee resource groups
• Paid time off for individual and team community service
• A generous schedule of paid holidays, including the week between Christmas and New Year's Day
• Paid time off and the option to purchase additional vacation time

Similar Jobs

Forward Deployment Engineer – AI & Agentic Solutions

🕒 May 9

Visionary Integration Professionals (VIP)

501 - 1000

🤝 B2B

🏛️ Government

Forward Deployment Engineer working on AI-enabled solutions for clients at Visionary Integration Professionals. Collaborating with customers to design, prototype, and support implementations across various sectors.

🇺🇸 United States – Remote

💵 $130k - $165k / year

💰 Debt Financing on 2018-11

⏰ Full Time

🟡 Mid-level

🟠 Senior

⛑ DevOps & Site Reliability Engineer (SRE)

AWS

Azure

ERP

JavaScript

Python

TypeScript

Fullstack Developer – DevOps

🕒 May 9

Datmos

51 - 200

🛍️ eCommerce

🤝 B2B

🏢 Enterprise

Full-Stack Developer / DevOps at Datmos transforming AI models into production-grade products. Creating secure interfaces and robust cloud infrastructures within the AI Task Force.

🇺🇸 United States – Remote

⏰ Full Time

🟡 Mid-level

🟠 Senior

⛑ DevOps & Site Reliability Engineer (SRE)

Amazon Redshift

AWS

Azure

BigQuery

Cloud

Docker

ETL

Google Cloud Platform

GraphQL

JavaScript

Kubernetes

Next.js

Node.js

NoSQL

Python

React

SQL

Terraform

Senior Site Reliability Engineer

🕒 May 9

TechInsights

201 - 500

Senior Site Reliability Engineer building reliability and AI operations for semiconductor workflows. Owning strategic initiatives, collaborating across teams, and driving reliability standards and practices.

🇺🇸 United States – Remote

💵 $149.1k - $157.8k / year

⏰ Full Time

🟠 Senior

⛑ DevOps & Site Reliability Engineer (SRE)

AWS

Cloud

Docker

Java

Kubernetes

Python

Spring

Spring Boot

Terraform

Senior DevOps Engineer

🕒 May 8

DroneDeploy

201 - 500

🚀 Aerospace

Senior DevOps Engineer optimizing DroneDeploy's cloud infrastructure and CI/CD pipelines. Collaborating with teams to enhance reliability and integrate AI tooling into workflows.

🇺🇸 United States – Remote

💰 Series F on 2021-01

⏰ Full Time

🟠 Senior

⛑ DevOps & Site Reliability Engineer (SRE)

🦅 H1B Visa Sponsor

AWS

Azure

Cloud

Google Cloud Platform

Jenkins

Kubernetes

Linux

Python

Terraform