April 18
• Build and manage a data lake in AWS, leveraging and augmenting the existing Lake Formation-based architecture
• Build and maintain data pipelines from a variety of data sources — including streaming datasets, APIs, and various data stores — leveraging PySpark and AWS Glue
• Create datasets from the data lake to support various use cases, such as business analytics, dashboards, reports, and machine learning
• Drive technical decisions on the best ways to serve data consumers (data warehouse vs. search engine databases, etc.)
• Leverage existing AWS architectures and design new ones where needed, using the CDK toolkit
• Operationalize data workloads in AWS, automating pipelines and implementing appropriate monitoring
• Work with cross-functional teams to discover business needs and design appropriate data flows
• Bachelor’s degree
• Minimum 3 years of experience developing data solutions in a modern cloud environment
• Fluency in Python
• Experience authoring and maintaining ETL jobs (PySpark experience a plus)
• Experience designing and interacting with relational and non-relational data stores
• Experience with the AWS ecosystem and Infrastructure-as-Code methodologies (CDK a plus)
• Demonstrated ability to manage production data workloads (detecting and diagnosing issues, monitoring)
• Passion for excellence and making the world a better place!
• 11 paid holidays
• Generous accrued time off, increasing with years of service
• Generous paid sick time
• Annual day of service
April 12
11 - 50
🇺🇸 United States – Remote
💰 $100M Venture Round on 2019-09
⏰ Full Time
🟡 Mid-level
🟠 Senior
🚰 Data Engineer