SRE (Automation Developer)

🌍 Remote, USA 🎯 Full-time 🕐 Posted Recently

Job Description

About the position

Design, analyze, and troubleshoot large-scale distributed systems.
Participate in on-call rotation, engage with product teams to fix production outages, and carry forward action items to improve ongoing reliability.
Develop effective tooling, alerts, and response procedures to identify and address reliability risks, including automatic problem detection and mitigation.

Minimum of 7 years of experience.
Proficient in Linux.
Expert in configuration management tools like Ansible.
Knowledgeable in creating CI/CD pipelines, preferably with Jenkins.
Skilled in optimizing container builds.
Hands-on experience with Kubernetes or OpenShift.
Comfortable writing scripts in Bash and Python.
Practical experience in building React front-end applications with strong proficiency in JavaScript/TypeScript.
Expertise in developing backend services and APIs, particularly using Python frameworks.
Strong understanding of both SQL and NoSQL databases.
Familiar with message brokers and task queues such as Kafka, Redis, and Celery for asynchronous task processing.

Experience working with production Kubernetes/OpenShift environments is strongly preferred.
In-depth experience with Ansible, Python, Terraform, and CI/CD tools such as Jenkins, IBM Continuous Delivery, and ArgoCD.
Hands-on experience crafting alerts and dashboards using tools such as Instana, New Relic, and Grafana/Prometheus.
