Job Description
Note: This is a remote position open to candidates in the USA. Lavendo is a publicly traded company at the forefront of the AI revolution, offering an AI-centric cloud platform. They are seeking a Senior AI/ML Specialist Solutions Architect to design and implement scalable AI solutions for AI-focused customers, working with state-of-the-art technologies and contributing to one of the most powerful commercially available supercomputers.
Responsibilities
- Architect and optimize distributed training and inference systems for large-scale AI models
- Design and deliver customer-focused solutions that maximize performance and business value
- Lead the transition of ML pipelines from POC to scalable production systems
- Build long-term customer relationships, ensuring satisfaction and alignment with strategic goals
- Create whitepapers, deliver technical presentations, and host webinars to share insights and best practices
- Provide technical leadership and mentor teams on AI infrastructure and deployment strategies
- Collaborate with engineering and product teams to prioritize customer feedback and influence product roadmaps
Skills
- 5+ years of experience with cloud technologies and infrastructure, ideally in senior MLOps or Solutions Architect roles
- Proven expertise in scaling and optimizing AI workloads across multi-node and multi-GPU environments
- Demonstrated success delivering ML products, scaling from POC to production
- Deep knowledge of ML frameworks like PyTorch and JAX
- Strong background in the NVIDIA HPC ecosystem (CUDA, NCCL, InfiniBand)
- Exceptional communication skills to engage both technical teams and business stakeholders
- Legal authorization to work in the United States on a full-time basis without sponsorship
- Programming Languages: Python, Go, Java, C++
- Infrastructure as Code (IaC): Terraform, Ansible
- Orchestration: Kubernetes (K8s), Slurm
- DevOps Tools: Git, Docker, Helm
- Big Data Frameworks: Spark, Kafka, Hadoop
- Databases: SQL, NoSQL, and vector databases
- ML Frameworks: PyTorch, TensorFlow, JAX, Hugging Face, scikit-learn
Benefits
- Full medical benefits: 100% company-paid medical, dental, and vision coverage for employees and families
- 401(k) plan with a 4% match program
- Stock options plan
- Flexible remote work environment
- Company-paid short-term disability, long-term disability, and life insurance coverage
- 20 weeks of paid parental leave for primary caregivers, 12 weeks for secondary caregivers
- Up to $85/month for mobile and internet
Company Overview
- Sales recruiting for startups in the United States. The company was founded in 2021 and is headquartered in San Francisco, California, USA, with a workforce of 2-10 employees. Its website is https://www.lavendo.io/.