Senior Platform Engineer – AI Agent Infrastructure

🕒 April 23

Yuno

11 - 50 employees

💳 Fintech

🏢 Enterprise

☁️ SaaS

Yuno provides payment orchestration and infrastructure on a global scale. Its technology lets businesses integrate over 300 payment methods, boosting acceptance rates and scaling payment operations across multiple regions. The platform simplifies payments with smart routing, unified payment insights, auto reconciliation, and custom checkout options, and it emphasizes security and fraud management to keep transactions safe. Yuno also supports global payouts and subscription management for businesses looking to optimize their payment systems and increase revenue.

📋 Description

• Designing event-driven communication.
• Improving streaming reliability.
• Building observability for the platform.
• Driving architectural decisions.
• Owning cloud infrastructure and automation with IaC.
• Building monitoring, tracing, and alerting systems.

🎯 Requirements

• Event-driven architecture and messaging systems: you've designed systems around message queues (Kafka, NATS, RabbitMQ, or similar). You understand at-least-once delivery, consumer groups, dead letters, backpressure, and ideally have migrated a system from synchronous to async messaging.
• AWS: deep experience with EC2, VPC, IAM, S3, RDS. You understand networking because inter-service communication runs over internal VPC.
• Databases: solid knowledge of both SQL (PostgreSQL) and NoSQL (MongoDB, Redis). You understand when to use each, indexing strategies, replication, and performance tuning.
• Docker: container lifecycle, resource limits, health checks, bind mounts, multi-stage builds.
• Distributed systems debugging: you've debugged async flows and cascading failures across services in production, and can explain what failed and how you fixed it.
• Infrastructure as Code: Terraform or Pulumi. You believe infrastructure should be reviewed in PRs, not clicked in consoles.
• Observability: Datadog fluency or equivalent (dashboards, monitors, APM, log pipelines, distributed tracing).
• Tech Stack: hands-on experience with Go, AWS (EC2, S3, VPC, RDS PostgreSQL), Docker, PostgreSQL, MongoDB, Redis, and Datadog.
• AI / MLOps infrastructure: experience running AI workloads in production (model serving, LLM inference, GPU/resource management, agent evaluation and observability tools like LangFuse, LangSmith, Braintrust, MLflow).
• Multi-tenant container platforms: experience with platforms that run customer/user workloads in containers (Replit, Railway, Fly.io, or internal PaaS systems).
• Kubernetes: you've done the migration from "Docker on bare EC2" to K8s at least once and know what breaks during the transition.
• Data pipelines and orchestration: Airflow, Prefect, or similar. Knowledge of data warehouses (Databricks, Snowflake, BigQuery) is a plus.

🏖️ Benefits

• Competitive Compensation.
• Remote Work – You can work from anywhere!
• Home Office Bonus – A one-time allowance to help you create your ideal home office.
• Work Equipment.
• Stock Options.
• Health Plan wherever you are.
• Flexible Days Off.
• Language, Professional, and Personal Growth courses.
