Job Description

Position Title: Senior Cloud Site Reliability Engineer (Azure)
Department: Information Technology
Location: Remote
Reports To: Platform DevOps Team Lead

Installation Made Easy (“IME”) provides software and process management that enable retailers and contractors to offer installed home improvements to homeowners in a convenient, consistent, and affordable manner. IME senior management has over 100 years of retail management and home improvement industry experience.

We are seeking a Senior Cloud Site Reliability Engineer (SRE) with deep expertise in Microsoft Azure application platforms and hands-on experience with Ansible Automation Platform. If you enjoy digging into cloud infrastructure, automating repetitive tasks, and keeping mission-critical systems running smoothly, this role is for you!

The ideal candidate will be able to improve and respond to monitoring and alerting systems (LogicMonitor, Sumo Logic, PagerDuty) and lead remediation efforts across Azure environments. You will help make our cloud platform more stable, secure, automated, and cost-effective, while documenting everything clearly and owning projects end to end.
The candidate must be able to work independently in a remote environment.

Essential Functions:

Cloud Infrastructure Remediation & Reliability
• Lead and execute remediation projects across Azure environments focused on stability, performance, cost optimization, and security.
• Perform deep-dive troubleshooting and implement solutions that prevent recurring incidents.
• Maintain and improve Infrastructure as Code (IaC) following best practices.

Automation & Configuration Management
• Develop and manage automation workflows using Ansible Automation Platform.
• Collaborate with DevOps teams to integrate automation into CI/CD pipelines.
• Identify infrastructure tasks that should be automated and make them disappear.

Monitoring & Observability
• Enhance and tune monitoring and alerting through LogicMonitor, Sumo Logic, and PagerDuty.
• Respond to critical alerts, investigate root causes, and reduce alert fatigue by optimizing thresholds and logic.
• Work with teams to define sensible SLIs/SLOs for cloud services.

Pipeline & DevOps Engineering
• Build, maintain, and optimize CI/CD pipelines (Azure DevOps or GitLab CI/CD).
• Support infrastructure provisioning, deployment automation, and secrets management.
• Improve deployment reliability and consistency through automation.

Project Execution & Technical Leadership
• Lead infrastructure and reliability projects end to end while working closely with platform and development teams.
• Share knowledge with and mentor other engineers in SRE and cloud best practices.
• Champion operational excellence, reliability patterns, and scalability.

Documentation & Standards
• Produce clear documentation, diagrams, runbooks, and SOPs that real humans can read.
• Ensure infrastructure standards, patterns, and playbooks are consistently followed.
• Maintain internal knowledge bases to reflect the latest changes and learning.

Minimum Qualifications:
• 5+ years of hands-on Microsoft Azure experience across compute, networking, identity, security, and application services.
• Proficiency with Ansible Automation Platform for configuration management and orchestration.
• Experience using monitoring and incident response tools such as LogicMonitor, Sumo Logic, and PagerDuty.
• Strong scripting ability with PowerShell, Bash, or Python.
• Hands-on experience with Azure DevOps or an equivalent CI/CD platform.
• Demonstrated ability to lead technical projects and deliver results.
• Strong written communication skills for producing documentation and runbooks.
• Ability to thrive in a remote-first environment with minimal oversight.
• Strong collaborator who can work across engineering, DevOps, and security teams.
• Passionate about automation, observability, and reliability engineering.
• Comfortable with incident response, debugging under pressure, and performing root cause analysis.

Preferred Qualifications:
• Azure certifications (AZ-104, AZ-400, AZ-305).
• Experience working with containerized environments (AKS/Kubernetes).
• Familiarity with security and compliance frameworks such as CIS, NIST, and ISO.
• Exposure to PCI and SOC compliance environments.
• Knowledge of GitOps practices and advanced observability tools.
• Experience supporting software development teams with architecture and deployment patterns.

Physical Requirements:
• Prolonged periods of sitting at a desk and working on a computer.

Benefits to working with IME:
• 100% remote work environment.
• Employer-provided equipment.
• Medical, dental, and vision insurance.
• Health savings plan includes employer contribution to health savings account.
• Medical and dental flexible spending accounts.
• Company-paid basic life, short-term disability, and long-term disability insurance.
• 401(k) plan with employer match.
• Company matches 100% of the first 4% of salary deferrals.
• All contributions, including employer contributions, are 100% vested immediately.
• Employee discount program for electronics, groceries, travel, entertainment, and more.
• Employee assistance program.
• Pay on demand.
• Critical illness, hospital indemnity, group accident, and legal insurance.
• Paid time off.
• And more!

We are an Equal Opportunity and Drug-Free Workplace.

The Job Description is not an exhaustive statement of all duties, responsibilities, or qualifications of the job, nor is it intended to limit opportunities for necessary modifications.
The Job Description does not constitute an employment contract of any kind.