Site Reliability Engineer

🕒 May 27

🇺🇸 United States – Remote

⏰ Full Time

🟡 Mid-level

🟠 Senior

⛑ DevOps & Site Reliability Engineer (SRE)

OXIO

51 - 200 employees

📡 Telecommunications

☁️ SaaS

💳 Fintech

OXIO is a telecom company offering a cloud-native, programmable Telecom-as-a-Service platform that empowers businesses to launch and manage their own mobile networks. The company's platform, BrandVNO, enables the creation of custom mobile connectivity services without telecom expertise, allowing organizations to integrate these services seamlessly into their existing operations. OXIO provides tools for business intelligence, subscriber management, and personalized customer experiences, targeting retail, fintech, and various enterprise sectors. The company's API-first approach facilitates rapid innovation, cost reduction, and enhanced customer engagement across multiple markets.

📋 Description

• Design and implement a cloud platform to support OXIO backend services
• Automate technical operations: deployments, scaling, recovery, etc.
• Monitor and maintain mission-critical production infrastructure to ensure maximum uptime
• Participate in an on-call rotation and a culture of continuous improvement through blameless postmortems
• Enable the Engineering/Telecom/Data Engineering teams by providing them with the tools to operate the services they build

🎯 Requirements

• Understanding of Linux/Unix systems (most systems are Linux-based).
• Familiarity with Linux/Unix system internals like process management, filesystems, memory management, and networking.
• Proficiency in at least one programming language (Python, Go, or Ruby) and strong skills in scripting (Bash, Perl).
• Experience with infrastructure provisioning tools such as Terraform, CloudFormation, or Ansible.
• Familiarity with containerization (Docker) and orchestration tools (Kubernetes).
• Familiarity with monitoring tools like Prometheus, Grafana, or Datadog.
• Knowledge of setting up alerts, analyzing logs, and creating dashboards for observability.
• Familiarity with incident management practices (e.g., runbooks, postmortems).
• Experience being part of an on-call rotation and handling incidents.
• Experience setting up and maintaining Continuous Integration/Continuous Delivery pipelines (Jenkins, GitLab CI, CircleCI, etc.).
• Hands-on experience with cloud providers (AWS, Google Cloud, Azure).
• Knowledge of virtualization technologies (VMware, KVM) and cloud-native architecture.
• Understanding of TCP/IP, DNS, HTTP/HTTPS, load balancing, and firewalls.
• Strong understanding of deployment strategies (canary releases, blue-green deployments, etc.).
• Familiarity with high availability and an understanding of failover mechanisms.
• Familiarity with IAM (Identity and Access Management) and zero trust principles.
• Experience working with distributed systems (e.g., Kafka, Cassandra, Elasticsearch).
• Experience building custom monitoring tools or writing complex automation scripts.
• Functional knowledge of database management (SQL and NoSQL).
• Familiarity with distributed tracing (Jaeger, OpenTelemetry) and advanced log aggregation strategies (ELK stack, Splunk).
• Familiarity with performance profiling tools and optimizing application performance under heavy load.
• Familiarity with load testing and identifying bottlenecks.
• Familiarity with configuration management using SaltStack for maintaining server configurations.

🏖️ Benefits

• N/A
