Senior Data Engineer – AWS, RAG Pipelines

🔥 0 minutes ago

🇨🇴 Colombia – Remote

⏰ Full Time

🟠 Senior

🚰 Data Engineer

Jalasoft

1001 - 5000 employees

Founded 2003

☁️ SaaS

📚 Education

Software Development • SaaS • Education

Jalasoft is a global nearshore software development company with a presence across 70 cities in 13 countries. With a team of more than 1,000 software engineers based in South America, Jalasoft specializes in software development, quality assurance (QA), and DevOps solutions. The company focuses on staff augmentation and dedicated teams tailored to client needs. Jalasoft places a strong emphasis on security, holding an ISO 27001 certification, and partners with technology firms such as Palo Alto, NVIDIA, and Cisco to offer network and data center management. In addition, Jalasoft operates Jala University, which offers educational programs in technology to develop and recruit tech talent. The company aims to drive digital transformation by providing agile, culturally aligned nearshore software solutions.

📋 Description

• Design and operate the cloud data infrastructure powering AI initiatives.
• Architect production-scale data lakes on AWS.
• Build real-time ingestion and observability pipelines.
• Own the vector search and embedding layers that feed RAG systems and autonomous agents.

🎯 Requirements

• Overall Experience: 7+ years in Data Engineering, Distributed Systems, or Data Architecture
• AWS & Infrastructure: 4+ years architecting production-scale data lakes, storage tiers, and event streaming
• AI/LLM Pipelines: 2+ years building RAG systems, managing embeddings, and orchestrating foundation models
• Proficiency in AWS Data Lake Architecture & Storage
• Proficiency in Real-Time Observability & Log Analytics
• Proficiency in Elasticsearch & OpenSearch Optimization, Vectorization, Embeddings
• Proficiency in Amazon Bedrock & Generative AI Pipelines
• Proficiency in Software Engineering & API Ingestion
• Production-level proficiency in one or more of: C# (.NET Core), Java, Python, or Node.js
• AWS S3 partitioning strategies, lifecycle policies, and columnar formats (Parquet, Iceberg)
• AWS Glue Data Catalog and Lake Formation for multi-tenant, fine-grained access control
• Query optimization over petabyte-scale datasets using Amazon Athena and Redshift Spectrum
• Distributed OpenTelemetry (OTel) Collector configuration for log, trace, and metrics capture and routing into S3
• High-volume streaming of system logs, Datadog captures, and raw server events into S3
• Real-time CDC from PostgreSQL using Debezium or AWS DMS
• Amazon OpenSearch clusters with simultaneous lexical and high-dimensional vector search
• OpenSearch index lifecycle management, sharding strategies, and dynamic mappings at scale
• Amazon Bedrock foundation model APIs (Claude, Titan) for data enrichment, classification, and semantic parsing
• Knowledge Bases for Amazon Bedrock for automatic chunking, metadata extraction, and vector index syncs from S3
• ETL/ELT pipelines ingesting unstructured event data from SaaS APIs (e.g., Pendo, Hotjar, Google Analytics)
• MCP server development to expose data lake context and utilities to AI agents

🏖️ Benefits

• Remote work.
• 13 floating holidays.
• 15 vacation days per completed year.
• Good working environment.
