Senior Director, Software Engineering – AI ML Engineering

🌍 Remote, USA 🎯 Full-time

Job Description

About the position

We’re looking for a hands-on technical leader to architect, fine-tune, and deploy on-device small language models (SLMs) for consumer security at scale. You’ll lead a focused team of 3–5 senior engineers while remaining deeply involved in the code and technical architecture.

Your core responsibility is building high-performance, privacy-preserving AI models that run directly on user devices (Mac, iOS, Android, Linux). You’ll own model optimization, fine-tuning for tool-use accuracy, evaluation frameworks, and cost-aware deployment strategies. While you won’t own the agent orchestration platform itself, you’ll work closely with it to ensure models behave correctly in multi-turn conversations and make reliable tool-calling decisions.

This role sits at the intersection of edge ML, applied LLMs, and production engineering. Success requires navigating real-world tradeoffs: latency vs. capability, privacy vs. accuracy, on-device vs. cloud execution, and cost vs. performance.

This is not a traditional director role. You’ll spend 60%+ of your time on technical architecture and implementation, with the remainder focused on mentoring senior engineers and setting technical direction.

This is a hybrid remote position based at one of our hub locations: Frisco, TX or San Jose, CA. You will be required to be onsite as needed, typically 1–4 days per month. We are only considering candidates within commuting distance of one of these locations and are not offering relocation assistance at this time.

Responsibilities

Design and deploy small language models optimized for on-device inference (Mac, iOS, Android, Linux)
Lead model optimization efforts including quantization, pruning, distillation, and efficient inference pipelines
Fine-tune models to improve tool selection accuracy and conversational behavior in security-focused workflows
Build evaluation frameworks to measure model efficacy, tool-calling accuracy, conversation quality, and safety in production
Create synthetic data and workflow simulations to train and validate security-relevant conversations
Work closely with the agent orchestration platform to optimize multi-turn dialogue behavior and state handling
Implement cost-optimization strategies such as intelligent on-device vs. cloud routing, prompt caching, batching, and token efficiency
Integrate cloud-based LLMs when deeper reasoning or broader context is required
Build production ML systems that detect threats and protect users directly on-device
Set technical standards and architectural direction for AI/ML across the security platform
Mentor principal engineers and architects while remaining hands-on

Requirements

10+ years of software engineering experience, with 5+ years focused on ML/AI
Proven experience shipping ML models to production, with skills transferable to deployment on edge or mobile platforms
Experience with conversational AI systems and tool/function-calling architectures
Strong Python and systems programming skills (C++ or Rust) for performance-critical code
Deep expertise in model optimization (INT4/INT8 quantization, pruning, distillation)
Hands-on experience with PyTorch and at least one edge deployment framework (TensorFlow Lite, CoreML, ONNX Runtime, or llama.cpp)
Experience building evaluation and benchmarking frameworks for ML systems

Preferred qualifications

Experience applying ML systems in security, safety, or other adversarial domains
Master’s degree in CS, ML, or a related field (or equivalent practical experience)