Job Description
About the Role
We are partnered with a highly innovative and disruptive aerospace company building next-generation aviation technology. They are seeking a Site Reliability Engineer to support critical production infrastructure in a fast-paced, high-impact environment.
This role focuses on system reliability, observability, and maintaining performance across live systems.
What You’ll Do
- Manage and support Kubernetes clusters in production environments
- Implement and improve monitoring, alerting, and observability systems
- Participate in on-call rotations and respond to incidents
- Use Terraform to provision and manage infrastructure
- Troubleshoot and resolve production issues
- Partner with engineering teams to improve system performance and reliability
What You Need
- Experience in Site Reliability Engineering or similar production-focused roles
- Strong Kubernetes experience in live environments
- Hands-on Terraform experience
- Experience with monitoring tools (Prometheus, Grafana, Datadog, etc.)
- On-call and incident response experience
- Experience with Ansible or similar configuration management tools
- Scripting experience (Python, Bash, or similar)
Nice to Have
- Experience in high-scale or high-urgency environments (fintech, social platforms, etc.)
- Development experience
- Strong communication skills and ability to work independently
Job Types: Full-time, Contract
Pay: $90.00 - $100.00 per hour
Benefits:
- 401(k)
- Dental insurance
- Health insurance
- Paid time off
- Vision insurance
Work Location: Remote