Site Reliability Engineer

Altimetrik • San Jose, CA, US • 8h ago

We are looking to hire a Site Reliability Engineer.

Educational Background: Holds a bachelor’s or master’s degree in computer science, information technology, or a related technical field. Alternatively, significant work experience in DevOps or cloud infrastructure management could offset a formal degree requirement.

Cloud Infrastructure Expertise: Has at least 4 years of hands-on experience with AWS, including launching, maintaining, and scaling applications on the platform. The candidate should also demonstrate familiarity with core AWS services, security practices, and cost management.

Performance Monitoring and Troubleshooting: Proven track record in monitoring and troubleshooting performance issues across cloud environments. This includes experience tuning for optimal performance and ensuring high availability in production environments.

DevOps Automation Skills: Proficient in scripting languages like Python, Ruby, or Bash for automation and pipeline development within a DevOps framework. The candidate should also be able to use these skills for tasks like CI/CD, configuration management, and infrastructure as code (IaC).

Programming Proficiency: Comfortable working in at least one of the following languages: Java, Python, or Ruby. This experience helps the candidate understand and work with software applications from both an operational and developmental perspective.

Containerization and Orchestration: Hands-on knowledge of Docker and Kubernetes, along with tools like ArgoCD for GitOps workflows. Experience in this area is essential to manage and scale containerized applications effectively.

Monitoring and Observability Tools: Proficient with monitoring and observability tools such as Splunk, Wavefront, AppDynamics, and Prometheus, as well as distributed tracing tools. Experience in configuring and maintaining these tools is necessary to ensure visibility across the infrastructure.
