Job Description
Sr. Site Reliability Engineer (Compute Platform) - Remote
Contract-to-Hire
Must Have
- 6+ years of experience in infrastructure engineering, platform engineering, or DevOps with a strong focus on compute system design
- Proven experience designing and automating bare metal compute environments at scale
- Strong hands-on experience with PXE boot, network-based OS provisioning, and automated server imaging
- Experience implementing or supporting Bare Metal as a Service (BMaaS) platforms
- Practical experience using Redfish APIs for hardware provisioning, power management, and remote lifecycle operations
- Deep expertise with Ubuntu Linux in enterprise environments
- Strong hands-on experience with KVM-based virtualization platforms (SUSE Harvester, OpenStack)
- Experience designing and deploying production-grade Kubernetes clusters
- Strong background with enterprise compute hardware platforms, including Cisco UCS, Dell PowerEdge, HPE, and Supermicro systems
- Proficiency with Infrastructure as Code tools (e.g., Terraform, Ansible, or similar)
- Experience building or supporting CI/CD pipelines for infrastructure and platform automation
- Strong scripting skills in Python, Bash, or similar languages
- Demonstrated ability to produce clear, structured technical design documentation
- Excellent written and verbal communication skills
- Bachelor's degree in computer science or equivalent professional experience