Lead Site Reliability Engineer | Curology | Remote US

🌍 Remote, USA 🎯 Full-time 🕐 Posted Recently

Job Description

Mission of the Role:
Architect and lead the delivery of high-quality, reliable solutions that address our business problems on a frequent and regular cadence, through creative problem-solving and technical expertise. Enable engineers on your team to improve the quality and impact of their work and delivery. Evangelize reliability-as-a-feature through monitoring, service-level objectives, automation, everything-as-code, and testing.

Essential Functions and Impact Areas:
• Provide technical leadership and guidance to the SRE team, driving best practices in reliability engineering, automation, and service management.
• Set the direction for SRE projects, align them with organizational goals, and ensure successful execution from concept to delivery.
• Help define and instrument Service-Level Objectives to ensure an excellent customer experience.
• Lead initiatives to improve system resilience and scalability.
• Host postmortems to share learnings, discover gaps, embrace transparency, and improve reliability across our services.
• Lead projects from inception to completion.
• Participate in an on-call rotation to help find a resolution during incidents.

Minimum Skills & Requirements:
• 7+ years of experience building infrastructure solutions in AWS using Infrastructure-as-Code technologies such as Terraform or CloudFormation.
• 7+ years of experience working with Docker containers and related orchestration technologies (such as Kubernetes or ECS).
• 7+ years of experience building and deploying CI/CD pipelines.
• Experience with AWS, Docker, Kubernetes, Terraform, Python, PHP, and Laravel.
• Experience with architectural patterns of large, high-scale applications, such as well-designed APIs and database schemas.
• Experience leading projects and initiatives that are wide in scale and complex in nature.
• Experience working collaboratively in cross-functional teams with engineers in product and data groups.
• Deep technical expertise: writes, debugs, and refactors code while being mindful of tradeoffs, scalability, architecture, and code cleanliness.
• Demonstrates mastery of their craft to solve problems in automation, infrastructure, and/or developer tooling.
• Reliability & Quality: experience leveraging observability tooling and practices such as SLOs to help engineering teams own the reliability and quality of the software they build.
• Leadership: defines and delivers large, complex projects that may include coordination with non-technical stakeholders. Helps define the SRE function and champions it throughout the organization.

Why You’ll Love Working at Curology:
• Competitive salary and equity packages
• Company Performance Incentive Plan
• Comprehensive benefits: medical, dental, and vision insurance for employees; flexible spending account; 401(k); mental health & wellness programs
• $75 WFH stipend (remote employees)
• Home office setup stipend (remote employees)
• Minimum Time Off policy (unlimited PTO, with at least 3 weeks off) for exempt employees
• 11 company-observed holidays
• Additional holidays: Curology days off (1 per quarter), 1 annual floating holiday (employee’s choice), and Gratitude Week (employees take the full week of Thanksgiving off; business-critical teams observe different days)
• Paid parental leave
• Employee donation matching program
• Company-sponsored events
• Free subscription to Curology or Agency