Job Description
- Job Description:
- Develop algorithms for AI/DL, data analytics, machine learning, or scientific computing
- Contribute to and advance the open-source NeMo-RL, Megatron Core, and NeMo Framework projects.
- Solve large-scale, end-to-end AI training and inference challenges spanning the full model lifecycle: initial orchestration, data pre-processing, model training and tuning, and model deployment.
- Work at the intersection of computer architecture, libraries, frameworks, AI applications, and the entire software stack.
- Innovate and improve model architectures, distributed training algorithms, and model parallel paradigms.
- Tune and optimize performance, and train and fine-tune models with mixed-precision recipes on next-gen NVIDIA GPU architectures.
- Research, prototype, and develop robust and scalable AI tools and pipelines.
- Requirements:
- MS, PhD or equivalent experience in Computer Science, AI, Applied Math, or related fields.
- 5+ years of industry experience.
- Experience with AI frameworks (e.g., PyTorch, JAX, Ray) and/or inference and deployment environments (e.g., TRTLLM, vLLM, SGLang).
- Proficient in Python programming, software design, debugging, performance analysis, test design and documentation.
- Consistent record of working effectively across multiple engineering initiatives and improving AI libraries with new innovations.
- Strong understanding of AI/Deep-Learning fundamentals and their practical applications.
- Benefits:
- You will also be eligible for equity and benefits