Job Description
- Requirements
- 10+ years of industry experience spanning machine learning engineering and distributed systems
- 3+ years of leadership and management experience, with a proven ability to build and lead strong technical teams
- M.Sc. or Ph.D. in Computer Science, Machine Learning, or a related field, or equivalent practical experience
- Proven expertise in building and deploying end-to-end ML systems at scale, including recommendation and personalization systems
- Strong background in distributed systems architecture, including low-latency services, streaming platforms, and large-scale serving
- Hands-on experience with deep learning frameworks (e.g., TensorFlow, PyTorch) and ML infrastructure technologies
- Track record of delivering high-quality, scalable, and fault-tolerant systems
- Excellent communication skills and the ability to influence product and technical strategy
- Proven experience deploying large-scale serving systems on AWS, and demonstrated expertise in using Databricks for large-scale data processing and ML workflows
- What the job involves
- We are seeking a Director of Machine Learning Engineering and Infrastructure to lead a hybrid team bridging advanced ML engineering with world-class infrastructure design.
- In this role, you will own the strategic direction and execution for scaling our machine learning capabilities while ensuring our distributed systems and infrastructure can support innovation at massive scale.
- You will combine technical depth with leadership excellence to guide teams that deliver both foundational ML systems and high-performance distributed services.
- Lead and manage high-performing teams across ML engineering and ML infrastructure, fostering a culture of innovation, collaboration, and growth
- Define and execute the strategic roadmap for ML systems, including recommendation, personalization, and ads optimization
- Oversee the design, development, and deployment of scalable ML pipelines: data ingestion, feature engineering, model training, evaluation, and serving
- Architect distributed systems to support ML workloads at scale, ensuring reliability, observability, and operational excellence
- Partner closely with Product, Engineering, and Content teams to align on business goals and deliver impactful ML-driven experiences
- Support best practices in experimentation, evaluation, and ML system monitoring
- Ensure cost efficiency, scalability, and performance in ML infrastructure investments