
51 - 200 employees
Founded 2018
🤖 Artificial Intelligence
🤝 B2B
🏢 Enterprise
Marvik is a technology consultancy that designs, builds, and deploys production-ready artificial intelligence solutions for enterprise customers. They offer end-to-end AI services including strategy and opportunity discovery, data engineering, model development (agents, LLMs, generative AI, computer vision, predictive analytics), robotics and automation, and on-demand senior AI talent and leadership (fractional CAIO). Marvik focuses on delivering scalable AI that drives business impact across industries like retail, e-commerce, logistics, fintech, manufacturing, healthcare, energy, and government.

• Build and operate ingestion, ELT/ETL, and orchestration pipelines that move data from our MongoDB Atlas operational store and other sources into our analytical and AI-serving layers
• Implement layered (medallion-style) transformations with idempotent, backfillable, incrementally loaded jobs
• Apply deduplication, normalization, and validation so downstream data is high-quality and trustworthy
• Modernize legacy / homegrown data flows via incremental, strangler-fig migrations that keep production stable
• Build embeddings and vector pipelines, and the feature/retrieval-ready datasets that RAG, semantic search, and agentic workloads depend on
• Make production data AI-ready in practice: well-structured, lineage-tracked, and retrieval-friendly, in partnership with ML and application engineering
• Implement real-time and change-data-capture flows from MongoDB (Change Streams / CDC) where workloads require fresh data
• Implement the canonical data model, schemas, and data contracts defined by the Data Architect, enforced in-repo so other teams build against stable definitions
• Exercise sound persistence judgment in execution: land data in the right store (document / NoSQL, vector, analytical) per the architectural direction
• Contribute to build-vs-buy decisions by prototyping with proven, industry-standard tooling over custom development
• Establish testing, data-quality, and lineage checks for the pipelines you own, with clear alerting and runbooks
• Instrument pipeline observability (freshness, volume, schema-drift, cost) so failures are caught before consumers feel them
• Use AI-assisted development tools (Claude Code, Copilot, Cursor) as a force multiplier for transformation logic, query tuning, and migration scripting
• Partner with database engineering on extracting from and protecting the production store
• Partner with the Data Architect on implementing target-state patterns and surfacing what's hard to build
• Partner with ML, AI, and application engineers on the data they consume, shaping and governing it so it's safe and ready to build on
• 5+ years of hands-on data engineering experience building and operating production data pipelines at scale
• Strong programming and data skills: Python and SQL, with solid software-engineering fundamentals (version control, testing, CI), shipping and maintaining production code, not just notebooks
• Hands-on MongoDB at production scale (Atlas ideal): document modeling, aggregation framework, change streams / CDC, and extracting from a document store into analytical / AI-serving layers
• Demonstrated experience with ELT/ETL pipeline design, transformation frameworks (dbt or equivalent), and orchestration (Airflow, Dagster, or Azure Data Factory)
• Experience building on cloud-native data platforms and lake / lakehouse / warehouse architectures, with layered (medallion-style) modeling
• Hands-on experience preparing data for AI/ML or analytical consumers (embeddings / vector pipelines, RAG-/feature-ready datasets, or equivalent), including deduplication, normalization, and validation
• Familiarity with vector search and embeddings in production (MongoDB Atlas Vector Search or equivalent)
• Demonstrated use of AI-assisted development tools (Claude Code, Copilot, Cursor) for data and pipeline work
• Strong grasp of data quality, testing, lineage, and pipeline observability practices
• Comfortable working in a complex, specialized domain. MEP / AEC / construction experience is a plus; appetite to learn the domain is required
• Competitive salary
• Flexible working hours
• Professional development budget
• Home office setup allowance
• Global team events
🕒 May 26
Data Engineer Analyst designing and implementing scalable data solutions for AI services provider. Collaborating with teams to ensure high-quality, reliable data pipelines.
🇺🇾 Uruguay – Remote
💰 $100M Private Equity Round on 2022-08
⏰ Full Time
🟡 Mid-level
🟠 Senior
🚰 Data Engineer
AWS
Cloud
PySpark
Python