Data Architect

🔥 30 minutes ago

Cummins Inc.

10,000+ employees

Founded 1919

⚡ Energy

🚗 Transport

🔧 Hardware

💰 $75M Grant on 2024-07

Cummins Inc. is a global power technology leader that designs, manufactures, and distributes engines and power system solutions. Its products range from diesel and natural gas engines to hybrid and electric power systems, along with components such as turbochargers, fuel systems, and emissions solutions. With a strong emphasis on innovation, Cummins aims to reduce emissions and improve fuel efficiency. The company helps industries navigate the transition to cleaner energy through integrated power solutions for applications such as on-highway, marine, mining, and construction. Cummins also provides remote monitoring, diagnostics, and aftermarket support services, reinforcing its commitment to sustainability and customer service.

📋 Description

• Design and automate scalable data ingestion and transformation pipelines across relational, event-based, and unstructured sources.
• Build and maintain frameworks to monitor, detect, and resolve data quality and integrity issues. Implement data governance practices, including metadata management, data access, and retention policies.
• Architect and guide development of reliable, efficient, and scalable ETL/ELT data pipelines with monitoring and alerting.
• Design physical data models and optimize database structures, indexing, and relationships for performance.
• Test, optimize, and troubleshoot data pipelines to ensure stability and performance.
• Develop and manage large-scale data storage solutions using distributed and cloud platforms (e.g., data lakes, Hadoop, NoSQL databases).
• Drive automation and modernization of data infrastructure and integration processes to support agile analytics initiatives.

🎯 Requirements

• College, university, or equivalent degree in a relevant technical discipline, or relevant equivalent experience, is required.
• This position may require licensing for compliance with export controls or sanctions regulations.
• Intermediate experience in a relevant discipline area is required.
• Knowledge of the latest technologies and trends in data engineering is highly preferred and includes:
  • Familiarity with analyzing complex business systems, industry requirements, and/or data regulations
  • Background in processing and managing large data sets
  • Design and development for a Big Data platform using open source and third-party tools
  • Spark, Scala/Java, MapReduce, Hive, HBase, and Kafka, or equivalent college coursework
  • SQL query language
  • Clustered compute cloud-based implementation experience
  • Experience developing applications requiring large file movement for a cloud-based environment, and with other data extraction tools and methods from a variety of sources
  • Experience in building analytical solutions
• Intermediate experience in the following is preferred:
  • Experience with IoT technology
  • Experience in Agile software development
• Dimensional Modeling Mastery: Deep expertise in designing enterprise-scale dimensional models (star, snowflake, constellation) with strong command of fact table grain definition, surrogate key strategies, slowly changing dimensions (Types 1–6), bridge tables, and late-arriving data handling.
• Advanced SQL Engineering: Highly proficient in writing complex, high-performance SQL, including window functions, CTE-driven transformations, query plan analysis, cost-based optimization, partitioning strategies, and performance tuning across large, distributed datasets.
• Snowflake Architecture & Engineering: Hands-on experience with Snowflake internals, including micro-partitioning, clustering keys, result-set caching layers, warehouse sizing/auto-suspend tuning, Snowpipe/Streams/Tasks orchestration, Time Travel, Zero-Copy Cloning, and secure data sharing patterns.
• Graph Database & Cypher Proficiency: Strong experience with Neo4j or equivalent graph platforms, including graph schema design, Cypher query optimization, graph algorithms (PageRank, community detection, pathfinding), and integration of graph workloads with analytical and relational systems.
• Microsoft Fabric Ecosystem: Practical experience with Fabric Lakehouse architecture, Delta Lake optimization, Data Engineering pipelines, Data Factory orchestration, KQL-based Real-Time Analytics, semantic model creation, and integration with Power BI and OneLake governance.
• SAP S/4HANA Data Structures: Familiarity with SAP S/4HANA data models (FI/CO, MM, SD, PP), CDS views, OData services, SLT/SDI/ODP-based extraction patterns, and harmonization of SAP transactional data into cloud-based analytical platforms.
• Cloud Data Architecture: Strong understanding of distributed data processing, ELT/ETL orchestration, event-driven ingestion (Kafka/Event Hub), metadata-driven frameworks, schema evolution, and data lifecycle management across cloud environments (Azure preferred).
• Data Governance & Metadata Management: Experience implementing enterprise data catalogs, lineage tracking, data quality rules, master data integration, and security models (RBAC/ABAC, row-level and column-level security).
• Performance Engineering & Optimization: Ability to diagnose bottlenecks across compute, storage, and network layers; optimize workloads for cost and performance; and design scalable, fault-tolerant data architectures.
• Cross-Platform Integration: Experience integrating heterogeneous systems (SAP, Snowflake, Fabric, graph DBs, APIs, streaming platforms) into unified analytical ecosystems with a strong focus on interoperability and data consistency.
