Mid-Level Data Engineer

Job not on LinkedIn

🔥 24 minutes ago

Simple Technology Solutions

51 - 200 employees

🏛️ Government

🤖 Artificial Intelligence

Government • Cloud • Artificial Intelligence

Simple Technology Solutions is a HUBZone small business that specializes in IT modernization and digital experience for government operations. They focus on digitalizing government processes using cloud-native technologies and Agile practices to deliver full-stack digital solutions. The company emphasizes security, scalability, and interoperability in its enterprise approach. They work on enhancing cloud environments, migrating legacy IT systems, and promoting DevSecOps practices. Additionally, Simple Technology Solutions develops enterprise data management strategies using machine learning and AI, modernizes applications, and provides cloud contact center services. They primarily serve federal government agencies, particularly in law enforcement and public safety missions.

📋 Description

• Develop new ETL pipelines and data ingestion processes alongside senior engineers using AWS Glue (Spark-based, PySpark), MWAA (Airflow), Lambda, and SNS
• Integrate the agency's ETL Common Library into Glue jobs for standardized orchestration, error handling, metadata recording, and SNS notifications for all success and error job events
• Ingest structured and semi-structured datasets (CSV, XML, JSON, Avro, pipe-delimited) into S3 landing, raw, and curated zones using Apache Iceberg tables
• Configure static ETL metadata in the centralized PostgreSQL metadata store; ensure dynamic metadata records job status and timestamps for all key execution steps
• Monitor assigned production jobs and participate in operations support rotations
• Ensure ETL Load Reports are populated in real time and ETL Gap Reports are updated weekly
• Build and maintain materialized views and semantic layer objects in Trino and Athena to ensure optimized query performance and consistent business logic
• Produce and maintain required documentation for each assigned dataset: Business Requirements, ETL Design Documents, Data Models, Data Dictionaries, Mapping Documents, Deployment Documents, O&M Guides, and ETL Test Plans
• Write unit and integration tests achieving the 90% minimum code coverage threshold; complete security scans at least once per sprint
• Deploy ETL resources using CloudFormation templates through the agency CI/CD pipeline
• Support transition of ETL jobs from other agency teams and disaster recovery exercises

🎯 Requirements

• US citizenship is required
• Bachelor's degree is required
• Minimum of 3-5 years of position-related experience is required
• Hands-on experience with Python (PEP 8), PySpark, and SQL for ETL pipeline development
• Experience with AWS services including Glue, S3, MWAA (Airflow), Lambda, SNS, and SQS
• Familiarity with Apache Iceberg, Parquet, and ORC file formats and S3 data lake zone concepts
• Experience with PostgreSQL and basic familiarity with Redshift or Oracle
• Familiarity with Trino or Athena for query and semantic layer development
• Experience with CloudFormation, GitHub branching workflows, and CI/CD-integrated deployments
• Ability to produce clear ETL documentation including data models (Mermaid format) and data dictionaries
• Understanding of ETL metadata concepts including static and dynamic metadata, load reports, and gap reports
• Experience in agile development environments with sprint-based delivery
• Experience supporting IV&V and/or User Acceptance Testing (UAT) processes in a federal or technical program environment
• Experience with automated testing frameworks; ability to write unit and integration tests achieving defined code coverage thresholds
• Familiarity with FISMA, NIST 800-53, and OWASP ASVS Level 2 is a plus
• Must be able to work 8am-5pm Eastern Time regardless of home location
• Active federal public trust suitability determination, or the ability to obtain one, is required

🏖️ Benefits

• Flexible work arrangements
• Continuous learning
• Professional development
• Special incentives for team members living in qualified HUBZones

Similar Jobs

🔥 1 hour ago

Samsara

1001 - 5000

🏱 Enterprise

🚗 Transport

🔐 Security

Senior Data Engineer developing scalable data pipelines for IoT systems at Samsara. Designing data models and collaborating with cross-functional teams to enhance data analysis efficiency.

🔥 4 hours ago

AssistRx

501 - 1000

⚕ Healthcare Insurance

💊 Pharmaceuticals

☁ SaaS

Senior Manager leading teams in data engineering for scalable data solutions at AssistRx. Engaging with stakeholders to ensure successful project delivery and team development.

🔥 5 hours ago

Sardine

51 - 200

🔒 Cybersecurity

📋 Compliance

💳 Fintech

Data Engineer building and owning internal data infrastructure for analytics at Sardine. Integrating systems into a scalable data warehouse to drive decision-making and insights.

🔥 13 hours ago

Ochsner Health

10,000+ employees

⚕ Healthcare Insurance

🤝 Non-profit

📚 Education

Storage and Data Engineer leading enterprise storage migrations to AWS, Azure, and Rackspace. Ensuring smooth transitions into steady state run operations with strong resilience and compliance.

🔥 13 hours ago

Ad Hoc LLC

501 - 1000

🏛️ Government

🤖 Artificial Intelligence

🔌 API

Senior Data Architect at Ad Hoc collaborating on federal digital services. Leading data architecture strategy and guiding teams in complex data migrations and cloud solutions.