Data Engineer - Databricks (Mid Level) - US Citizens Only

🌍 Remote, USA 🎯 Full-time 🕐 Posted Recently

Job Description

Project requirements mandate that this role is open to US citizens only; there are no exceptions. An IRS MBI clearance is a plus, as is an active Secret or Top Secret clearance. All candidates must complete the clearance process before starting on the project.

• Infobahn Solutions is hiring Databricks data engineering professionals in the Washington, DC metro area for a US federal project with the Department of the Treasury.
• The Data Engineers will be part of a data migration and conversion team working on a large data lake being implemented on AWS GovCloud.
• Data will be migrated from on-premises mainframe and legacy database systems to the AWS landing zone on S3 using Informatica PowerCenter.
• Further conversion will be done in AWS using Databricks (PySpark); a hedged sketch of this conversion step appears after the responsibilities list below.
• The Data Engineer should have prior data migration experience and understand the intricacies of developing data integration routines that move data from multiple source systems to a new target system with a different data model.
• The Data Engineer should have experience converting Oracle PL/SQL and/or Greenplum code to Databricks; a conversion sketch appears at the end of this posting.
• Must have: experience with data migration and conversion using Databricks.
• Experience using Databricks on AWS and managing a Databricks production system is critical and a must-have for this project.

What you'll be doing:

• Databricks Environment Setup: Configure and maintain Databricks clusters, ensuring optimal performance and scalability for big data processing and analytics.
• ETL (Extract, Transform, Load): Design and implement ETL processes using Databricks notebooks or jobs to process and transform raw data into a usable format for analysis.
• Data Lake Integration: Work with data lakes and data storage systems to efficiently manage and access large datasets within the Databricks environment.
• Data Processing and Analysis: Develop and optimize Spark jobs for data processing, analysis, and machine learning tasks using Databricks notebooks.
• Collaboration: Collaborate with data scientists, data engineers, and other stakeholders to understand business requirements and implement solutions.
• Performance Tuning: Identify and address performance bottlenecks in Databricks jobs and clusters to optimize data processing speed and resource utilization.
• Security and Compliance: Implement and enforce security measures to protect sensitive data within the Databricks environment, ensuring compliance with relevant regulations.
• Documentation: Maintain documentation for Databricks workflows, configurations, and best practices to facilitate knowledge sharing and team collaboration.
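To make the conversion step concrete, here is a minimal PySpark sketch of the kind of job described above: reading raw extracts that the upstream Informatica PowerCenter jobs have landed on S3, applying light type conversion toward the target data model, and writing a Delta table. The bucket path, table names, and columns are all hypothetical, and the snippet assumes a Databricks notebook on AWS where the `spark` session is predefined.

```python
from pyspark.sql import functions as F

# Raw extracts landed on S3 by the upstream Informatica jobs
# (hypothetical bucket and path).
raw = spark.read.parquet("s3://example-landing-zone/legacy_accounts/")

# Light conversion: cast legacy mainframe-style fields to the types the
# target data model expects (all column names are placeholders).
converted = (
    raw
    .withColumn("account_id", F.col("ACCT_NO").cast("bigint"))
    .withColumn("opened_date", F.to_date("OPEN_DT", "yyyyMMdd"))
    .withColumn("balance", F.col("BAL_AMT").cast("decimal(18,2)"))
    .drop("ACCT_NO", "OPEN_DT", "BAL_AMT")
)

# Write the converted data to a Delta table
# (assumes the `curated` schema already exists).
converted.write.format("delta").mode("overwrite").saveAsTable("curated.accounts")

# A simple reconciliation check of the kind the role calls for:
# row counts must match between the landing zone and the converted table.
assert raw.count() == spark.table("curated.accounts").count()
```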
β€’ Collaboration and Communication: Effective communication and collaboration skills to work with cross-functional teams and stakeholders. β€’ Problem Solving: Strong problem-solving skills to troubleshoot issues and optimize Databricks workflows. β€’ Version Control: Experience with version control systems (e.g., Git) for managing and tracking changes to Databricks notebooks and code. Role Requirements: β€’ Bachelor/Master’s degree in computer science, Engineering, or related field β€’ 7-8 plus years of development experience on ETL tools (4+ years of Databricks is a must have) β€’ 5+ years of experience as a Databricks Engineer or similar role. β€’ Strong expertise in Apache Spark and hands-on experience with Databricks. β€’ More than 7 years of experience performing data reconciliation, data validation, ETL testing, deploying ETL packages and automating ETL jobs, developing reconciliation reports. β€’ Working knowledge of message-oriented middleware/streaming data technologies such as Kafka, Confluent β€’ Proficiency in programming languages such as Python or Scala for developing Spark applications. β€’ Solid understanding of ETL processes and data modeling concepts. β€’ Experience with data lakes and storage systems, such as Delta Lake, AWS S3, or Azure Data Lake Storage. β€’ Strong SQL skills for data manipulation and analysis. β€’ Good experience in shell scripting, AutoSys β€’ Strong Data Modeling Skills β€’ Strong analytical skills applied to business software solutions maintenance and/or development β€’ Must be able to work with a team to write code, review code, and work on system operations. β€’ Past project experience with Data Conversion and Data Migration β€’ Communicate analysis, results and ideas to key decision makers including business and technical stakeholders. β€’ Experience in developing and deploying data ingestion, processing, and distribution systems with AWS technologies β€’ Experience with using AWS datastores, including RDS Postgres, S3, or DynamoDB β€’ Dev-ops experience using GIT, developing, deploying code to production β€’ Proficient in using AWS Cloud Services for Data Engineering tasks β€’ Proficient in programming in Python/shell or other scripting languages for the purpose of data movement β€’ Eligible for a US Government issued IRS MBI (candidates with active IRS MBIs will be preferred) β€’ Databricks industry certifications - Associate / Professional Level Preferred Qualifications β€’ Cloud Data Migration and Conversion projects β€’ Experience on AWS Job Types: Full-time, Contract Pay: $90,000.00 - $130,000.00 per year Benefits: β€’ Dental insurance β€’ Flexible schedule β€’ Health insurance β€’ Life insurance β€’ Paid time off β€’ Vision insurance Education: β€’ Bachelor's (Preferred) License/Certification: β€’ Databricks Certified Data Engineer Professional (Required) Security clearance: β€’ Secret (Preferred) Work Location: Remote Apply tot his job

Ready to Apply?

Don't miss out on this amazing opportunity!

πŸš€ Apply Now
