Job Description
About the position

We are seeking a highly skilled and experienced Data Engineer to design, build, and maintain our scalable and robust data infrastructure on a cloud platform. In this pivotal role, you will enhance our data infrastructure, optimize data flow, and ensure data availability. You will be responsible for both the hands-on implementation of data pipelines and the strategic design of our overall data architecture. The ideal candidate has hands-on experience with AWS services such as AWS Glue, Lambda, Step Functions, and Lake Formation; proficiency in Python and SQL; and DevOps/CI/CD experience.

Responsibilities

• Design, develop, and maintain scalable data pipelines and ETL processes to support data integration and analytics.
• Collaborate with data architects, modelers, and IT team members to help define and evolve the overall cloud-based data architecture strategy, including data warehousing, data lakes, streaming analytics, and data governance frameworks.
• Collaborate with data scientists, analysts, and other business stakeholders to understand data requirements and deliver solutions.
• Optimize and manage data storage solutions (e.g., S3, Snowflake, Redshift), ensuring data quality, integrity, security, and accessibility.
• Implement data quality and validation processes to ensure data accuracy and reliability (a minimal sketch follows this list).
• Develop and maintain documentation for data processes, architecture, and workflows.
• Monitor and troubleshoot data pipeline performance and resolve issues promptly.
• Consulting and Analysis: Meet regularly with defined clients and stakeholders to understand and analyze their processes and needs, and determine requirements in order to present possible solutions or improvements.
• Technology Evaluation: Stay current with the latest industry trends and technologies to continuously improve data engineering practices.
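To make the pipeline and data-quality responsibilities above concrete, here is a minimal sketch of a validation step in Python with pandas. The file names, column names, and checks are hypothetical, chosen only for illustration; a real pipeline would wire this into its orchestration and monitoring tooling.

```python
import pandas as pd

REQUIRED_COLUMNS = {"order_id", "customer_id", "order_total"}  # hypothetical schema

def validate_orders(df: pd.DataFrame) -> pd.DataFrame:
    """Run basic data-quality checks before loading data downstream."""
    # Schema check: fail fast if expected columns are missing.
    missing = REQUIRED_COLUMNS - set(df.columns)
    if missing:
        raise ValueError(f"Missing required columns: {sorted(missing)}")

    # Integrity checks: the primary key must be non-null and unique.
    if df["order_id"].isna().any():
        raise ValueError("Null order_id values found")
    if df["order_id"].duplicated().any():
        raise ValueError("Duplicate order_id values found")

    # Accuracy check: quarantine rows with impossible values rather than failing the run.
    bad = df["order_total"] < 0
    if bad.any():
        df[bad].to_csv("quarantined_orders.csv", index=False)  # hypothetical quarantine sink
    return df[~bad]

if __name__ == "__main__":
    clean = validate_orders(pd.read_csv("orders.csv"))  # hypothetical source
    clean.to_parquet("orders_validated.parquet", index=False)  # requires pyarrow or fastparquet
```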
Requirements

• Cloud Expertise: Expert-level proficiency in at least one major cloud platform (AWS, Azure, or GCP), with extensive experience in its data services (e.g., AWS S3, Glue, Lambda, Redshift, Kinesis; Azure Data Lake, Data Factory, Synapse, Event Hubs; GCP BigQuery, Dataflow, Pub/Sub, Cloud Storage); experience with the AWS data platform is preferred.
• SQL Mastery: Advanced SQL writing and optimization skills.
• Data Warehousing: Deep understanding of data warehousing concepts, the Kimball methodology, and various data modeling techniques (dimensional, star/snowflake schemas); a star-schema sketch appears at the end of this posting.
• Big Data Technologies: Experience with big data processing frameworks (e.g., Spark, Hadoop, Flink) is a plus.
• Database Systems: Experience with relational and NoSQL databases (e.g., PostgreSQL, MySQL, MongoDB, Cassandra).
• DevOps/CI/CD: Familiarity with DevOps principles and CI/CD pipelines for data solutions.
• Hands-on experience with AWS services such as AWS Glue, Lambda, Step Functions, and Lake Formation.
• Proficiency in Python and SQL.

Nice-to-haves

• 4+ years of progressive experience in data engineering, with a significant portion dedicated to cloud-based data platforms.
• ETL/ELT Tools: Hands-on experience with ETL/ELT tools and orchestrators (e.g., Apache Airflow, Azure Data Factory, AWS Glue, dbt); an orchestration sketch appears at the end of this posting.
• Data Governance: Understanding of data governance, data quality, and metadata management principles.
• AWS Experience: Ability to evaluate AWS cloud applications and make architecture recommendations; an AWS Solutions Architect certification (Associate or Professional) is a plus.
• Familiarity with Snowflake.
• Knowledge of dbt (data build tool).
• Strong problem-solving skills, especially in data pipeline troubleshooting and optimization.
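As a concrete reference for the dimensional-modeling requirement above, here is a minimal star-schema sketch using Python's built-in sqlite3 module. The fact and dimension tables are hypothetical; a production warehouse would live in a platform such as Redshift, Snowflake, or BigQuery, but the modeling idea is the same.

```python
import sqlite3

# Build a tiny star schema in memory: one fact table keyed to two dimensions.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE fact_sales (
    date_key INTEGER REFERENCES dim_date(date_key),
    product_key INTEGER REFERENCES dim_product(product_key),
    amount REAL
);
INSERT INTO dim_date VALUES (20240101, 2024, 1), (20240201, 2024, 2);
INSERT INTO dim_product VALUES (1, 'widgets'), (2, 'gadgets');
INSERT INTO fact_sales VALUES
    (20240101, 1, 100.0), (20240101, 2, 50.0), (20240201, 1, 75.0);
""")

# A typical dimensional query: join the fact table to its dimensions
# and roll the measure up by dimension attributes.
for row in con.execute("""
    SELECT d.year, d.month, p.category, SUM(f.amount) AS total_sales
    FROM fact_sales f
    JOIN dim_date d ON d.date_key = f.date_key
    JOIN dim_product p ON p.product_key = f.product_key
    GROUP BY d.year, d.month, p.category
    ORDER BY d.year, d.month, p.category
"""):
    print(row)
```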
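For the orchestration tools listed under Nice-to-haves, here is a minimal sketch of an Apache Airflow DAG (assuming Airflow 2.x). The DAG id, task names, and task bodies are hypothetical placeholders for real extract/transform/load logic.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    print("pull raw data from the source system")  # placeholder

def transform():
    print("clean and reshape the extracted data")  # placeholder

def load():
    print("write the transformed data to the warehouse")  # placeholder

with DAG(
    dag_id="example_daily_etl",  # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    # Chain the three steps so each runs only after the previous one succeeds.
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)
    load_task = PythonOperator(task_id="load", python_callable=load)
    extract_task >> transform_task >> load_task
```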