Role: Site Reliability Engineer Lead
Full-time
Location: Austin, TX (Onsite)
Seeking a Site Reliability Engineer Lead. The primary responsibility of this position is to manage a team of SREs who proactively ensure the stability, resilience and scale of our services through automation, testing and engineering. The SRE Lead will build on expertise from product teams’ systems/operations, cloud infrastructure (AWS/GCP), build and release engineering, software development and stress/load testing to make sure our services are available, cost optimized and fit for purpose early in the development lifecycle. The SRE Lead will also work alongside the development, architecture and service management teams to ensure that technical solutions are aligned to architectural principles and deliver value to our customers, and that monitoring, logging and alerting are consistent. The SRE Lead is responsible for building capability and maturing operational ways of working across multiple cross-functional delivery teams, with a focus on technical excellence and a high-performance culture.
This position is based in Austin, TX. Candidates should be located within commuting distance or be willing to relocate to this area. This position may require relocation and/or travel to project locations.
U.S. citizens and those authorized to work in the U.S. are encouraged to apply.
Required Qualifications
- Bachelor’s degree or foreign equivalent required from an accredited institution. Will also consider three years of progressive experience in the specialty in lieu of every year of education.
- At least 4 years of IT industry experience.
- Experience in DevOps, cloud platforms (any of PCF, AWS, GCP, Azure), and support
- Experience in automation using scripting/programming (Bash, PowerShell, or Python).
- Experience in administration of ServiceNow, Harness, Jira, Bamboo and other Atlassian products.
- Expertise in logging and monitoring tools (Splunk, ThousandEyes, Prometheus, Grafana), including incorporating frameworks and instrumentation into C# code.
- Highly proficient with Kubernetes, Terraform and AWS/GCP.
Preferred Skills:
- At least 6 years of experience in DevOps, cloud platforms (any of PCF, AWS, GCP, Azure), and support
- At least 6 years of experience in automation using scripting/programming (Bash, PowerShell, or Python)
- Operational experience in maintaining applications
- Strong leadership skills to ensure scrum teams and co-workers are motivated and engaged to deliver against a roadmap
- Has significant experience in evolving practices and ways of working through multi-disciplinary teams, business frameworks and culture
- Has strong project management background and experience in leading technology change programs
- An individual who can perform highly in a multi-faceted role that calls for very strong technical knowledge and awareness of emerging trends
- A very strong communicator, able to lead and facilitate discussions across functions such as architecture, technical specialists, business analysis, team leaders, senior management, and executives
- Experience working with Windows and Linux Containers (focus currently on Windows)
- Strong understanding of non-functional (NF) testing (performance, security, cost optimization, etc.)
- Ability to get up to speed quickly on domain knowledge