Job Description
About the position
The Senior MLOps Engineer will support the Integrated Business Planning program by designing, implementing, and maintaining scalable machine learning infrastructure. This role involves deploying ML models to Kubernetes, collaborating with data scientists and IT teams, and optimizing workflows to ensure system reliability and scalability. The engineer will also provide technical guidance on model optimization and deployment, develop and test machine learning models, and manage Kubernetes clusters for scalable ML applications.
Responsibilities
• Design, implement, and maintain scalable ML infrastructure.
• Deploy ML models into Kubernetes environments.
• Collaborate with data scientists to understand model requirements.
• Provide technical guidance on model optimization and deployment.
• Develop, test, and deploy machine learning models using appropriate frameworks.
• Research and prototype the latest machine learning platform technologies.
• Manage Kubernetes clusters for deploying scalable ML models and applications.
• Implement Kubernetes Operators for managing ML workflows and resources.
• Optimize resource utilization and ensure high availability of ML services on Kubernetes.
• Document processes, workflows, and infrastructure setups for transparency.

Requirements
• 6+ years of experience in machine learning engineering, with at least 3 years in a DevOps environment.
• Proven experience in deploying and managing ML models in production using Kubernetes.
• Strong programming skills in Python, with experience in ML libraries such as TensorFlow, PyTorch, or scikit-learn.
• Experience with MLOps tools such as Kubeflow, MLflow, or TFX.
• Proficient in DevOps tools and practices, including Docker, Jenkins, and Git.
• Extensive experience with Kubernetes for container orchestration and management.
• Hands-on experience building Azure Data Factory (ADF) pipelines with Azure Databricks.
• Excellent problem-solving and analytical skills; strong communication and teamwork abilities.
• Bachelor's or Master's degree in Computer Science, Engineering, or a related field.

Nice-to-haves
• Experience with the supply chain data domain.
• Familiarity with monitoring tools such as Prometheus and Grafana.
• Knowledge of infrastructure-as-code tools like Terraform or Ansible.
• Understanding of data engineering concepts and tools.

Apply to this job