Job Description
We are seeking an experienced Big Data Engineer (EL3 Level) with strong expertise in Apache Spark and Scala to design, develop, and optimize large-scale data processing solutions in the Healthcare domain. The ideal candidate will work on building scalable data pipelines, integrating diverse healthcare datasets (claims, EMR/EHR, provider, member data), and enabling analytics and reporting solutions while ensuring HIPAA compliance and data security.
This is a fully remote opportunity supporting enterprise healthcare data platforms.

Key Responsibilities

Big Data Engineering
- Design and develop distributed data processing pipelines using Apache Spark with Scala.
- Build batch and real-time data pipelines using Spark Core, Spark SQL, and Spark Streaming.
- Optimize Spark job performance (partitioning, caching, broadcast joins, memory management).
Healthcare Data Integration
- Process and transform healthcare datasets including:
  - Claims data (837/835)
  - EHR/EMR data
  - Member & Provider data
  - HL7/FHIR formats
- Ensure data quality, validation, and compliance with healthcare regulations (HIPAA).
Cloud & Data Platform
- Work on cloud-based big data platforms (AWS/Azure/Google Cloud Platform).
- Develop data pipelines using:
  - Data lakes (S3/ADLS)
  - Hive/Delta Lake/Iceberg
  - Kafka for streaming
- Implement CI/CD for data pipelines.
Data Modeling & Warehousing
- Design scalable data models (star/snowflake schema).
- Implement ETL/ELT frameworks.
- Support analytics and reporting teams with optimized datasets.
Governance & Security
- Implement data masking, encryption, and PHI protection strategies.
- Collaborate with compliance teams to ensure regulatory standards are met.
Required Qualifications
- 12+ years of IT experience in Big Data Engineering.
- Strong hands-on experience in:
  - Apache Spark
  - Scala
- Experience with distributed data processing and cluster tuning.
- Strong SQL knowledge and data modeling experience.
- Experience working with healthcare datasets.
- Experience with:
  - Hadoop ecosystem (Hive, HDFS)
  - Kafka
  - Airflow
- Cloud experience (AWS/Azure/Google Cloud Platform).
Preferred Qualifications
- Experience with:
  - Databricks
  - Delta Lake
- Experience implementing Data Lakehouse architecture.
- Knowledge of FHIR/HL7 standards.
- Experience in real-time healthcare analytics.
- Certification in AWS/Azure Data Engineering.
Apply to this job