Job Description
Role: Data Engineer (GenAI + Python + SQL)
Location: Remote
Hire Type: Contract (No C2C)
Required Skills & Qualifications
• Experience: 12+ years in Data Engineering, Software Development, or a related field.
• Programming: Expert-level proficiency in Python and SQL. Proficiency in Scala or JavaScript is a plus.
• Big Data Tech: Deep hands-on experience with Apache Spark, Databricks, and cloud data platforms (Azure/AWS).
• GenAI Stack: Proven experience with LLM orchestration frameworks (LangChain, LlamaIndex) and vector databases.
• API & Backend: Strong background in building RESTful APIs with Flask, FastAPI, or Django.
• Containerization & DevOps: Familiarity with Docker, Kubernetes, Terraform, and CI/CD pipelines.
• Databases: Experience with both SQL (PostgreSQL, MySQL) and NoSQL (MongoDB, Cassandra, Redis) systems.

Data Engineering & Infrastructure
• Scale Data Pipelines: Design and maintain robust ETL/ELT pipelines using Apache Spark, Databricks, and Azure Data Factory (ADF) to handle high-volume batch and streaming data.
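The pipeline work described above follows the classic extract → transform → load shape. A minimal sketch of that shape in plain Python generators, so records stream through the stages rather than being materialized at once (the posting's actual stack is Spark/Databricks/ADF; the record fields and stage logic here are illustrative assumptions):

```python
# Minimal ETL sketch: generators stream records through extract,
# transform, and load stages. In the posting's real stack these stages
# would be Spark transformations on Databricks, orchestrated by ADF.

def extract(rows):
    """Extract: yield raw records from a source (here, an in-memory list)."""
    for row in rows:
        yield row

def transform(records):
    """Transform: drop records missing the key and normalize the amount field."""
    for rec in records:
        if rec.get("user_id") is None:
            continue  # drop incomplete records
        yield {"user_id": rec["user_id"], "amount": float(rec.get("amount", 0))}

def load(records):
    """Load: collect into the 'warehouse' (here, just a list) and return it."""
    return list(records)

raw = [{"user_id": 1, "amount": "9.99"}, {"user_id": None}, {"user_id": 2}]
warehouse = load(transform(extract(raw)))
# warehouse == [{"user_id": 1, "amount": 9.99}, {"user_id": 2, "amount": 0.0}]
```

The generator chain keeps memory bounded for the batch case; the streaming case in the posting would swap the in-memory source for a Kafka or Structured Streaming source.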
• Real-Time Processing: Implement event-driven architectures using Kafka for real-time data ingestion and analytics.
• Data Lake Architecture: Oversee the organization and optimization of data lakes (ADLS, S3) and warehouses (Snowflake, BigQuery, or Delta Lake).
• Performance Tuning: Optimize Spark jobs and SQL queries to reduce latency and infrastructure costs.

Backend Development & APIs
• High-Performance APIs: Build and scale stateless microservices using Python (Flask/FastAPI) to handle high concurrency (1,000+ QPS).
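The real-time ingestion responsibility above is, at its core, an event-driven consumer loop: events arrive on a topic and are handled as they come. A minimal sketch of that loop with a stdlib queue standing in for a Kafka topic (a real deployment would use a Kafka client such as confluent-kafka polling a broker; the event shape, handler, and sentinel shutdown are illustrative assumptions):

```python
import queue

# A stdlib queue stands in for a Kafka topic partition; a production
# consumer would instead poll a broker via a Kafka client library.
topic = queue.Queue()

def handle(event):
    """Illustrative handler: tag each event as processed."""
    return {**event, "processed": True}

def consume(topic):
    """Event-driven loop: block until an event arrives, process it,
    and stop on a None sentinel (a real consumer runs until shutdown
    and commits offsets after processing)."""
    out = []
    while True:
        event = topic.get()
        if event is None:
            break
        out.append(handle(event))
    return out

# Produce a few events, then the shutdown sentinel.
for i in range(3):
    topic.put({"event_id": i})
topic.put(None)

results = consume(topic)
```

The blocking `get()` is what makes the loop event-driven rather than poll-and-sleep: the consumer does no work until an event is available.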
• Security & Authentication: Implement robust security layers including JWT-based authorization, rate limiting, and encryption protocols.
• Full-Stack Integration: Collaborate with frontend teams (React) to deliver seamless data visualizations (D3.js) and user dashboards.

Apply to this job
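The JWT-based authorization mentioned above boils down to signing a claims payload and verifying that signature on every request. A stdlib-only sketch of JWT HS256 sign/verify to show the mechanics (production code should use a vetted library such as PyJWT; the secret and claims here are illustrative assumptions):

```python
import base64
import hashlib
import hmac
import json
from typing import Optional

# Sketch of JWT (HS256) signing and verification using only the stdlib.
# Production services should use a vetted library such as PyJWT and also
# validate registered claims like `exp`, which this sketch omits.

def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def sign(claims: dict, secret: bytes) -> str:
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url(json.dumps(claims).encode())
    signing_input = f"{header}.{payload}".encode()
    sig = _b64url(hmac.new(secret, signing_input, hashlib.sha256).digest())
    return f"{header}.{payload}.{sig}"

def verify(token: str, secret: bytes) -> Optional[dict]:
    """Return the claims if the signature checks out, else None."""
    header, payload, sig = token.split(".")
    signing_input = f"{header}.{payload}".encode()
    expected = _b64url(hmac.new(secret, signing_input, hashlib.sha256).digest())
    if not hmac.compare_digest(sig, expected):
        return None  # reject tampered tokens
    padded = payload + "=" * (-len(payload) % 4)  # restore base64 padding
    return json.loads(base64.urlsafe_b64decode(padded))

secret = b"illustrative-secret"
token = sign({"sub": "user-42", "role": "engineer"}, secret)
claims = verify(token, secret)
assert verify(token + "x", secret) is None  # tampering is rejected
```

`hmac.compare_digest` is the constant-time comparison that prevents timing attacks on the signature check; a plain `==` would leak information about how many leading characters match.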