Site Reliability Engineer
*12-month contract with potential to extend/convert*
Hybrid: 2x a week onsite in Orlando
Optomi, in partnership with our premier client in the entertainment industry, is seeking an experienced Site Reliability Engineer to join their team in a hybrid role. The Site Reliability Engineer will thrive in a dynamic, high-energy, and collaborative environment. In this position, they will:
- Engage actively with development and QA teams, product and project managers, and product and business teams.
- Lead efforts in project planning, architectural review, and design, and participate in meetings with various teams.
- Provision, tune, and automate systems and applications, providing administration, maintenance, and support.
Technical Skills Required:
- Cloud Services: AWS (Fargate, ECS, Lambda, API Gateway, EC2, S3, ALB/ELB, ElastiCache, EKS, KMS/Secrets Manager, VPC, IAM), Google Cloud Platform (App Engine, Kubernetes (Helm/Tiller), Cloud Functions, Firebase, IAM)
- Infrastructure as Code: Terraform/Atlantis, Rundeck, Chef, Ansible, Vault
- Monitoring and Logging: CloudWatch, Splunk, AppDynamics, ElastiCache, Grafana
- Message Queuing: RabbitMQ, Pub/Sub
- Load Balancers
- Programming Languages: Go, Python, Node.js (AngularJS framework), Java (Spring MVC)
Experience and Proficiencies:
- Strong technical background in consumer and employee-facing enterprise systems
- Expertise in troubleshooting applications and systems
- Proficient in maintaining web, caching, and queuing technologies in high-traffic environments
- Skilled in architecting scalable and highly available systems
- Proficient with public cloud platforms (AWS, Google Cloud Platform, Azure)
- Experience with containerization and a programming language
- Knowledgeable in distributed version control systems (e.g., Git) and CI/CD techniques
- Capable of supporting both SQL and NoSQL technologies
Communication and Leadership Skills:
- Excellent communication and relationship-building abilities, able to explain advanced technical topics to both technical and non-technical staff
- Clear communicator with stakeholders at all levels in high-pressure situations
- Able to represent the team in incidents involving other business units
- Experienced in troubleshooting during on-call situations with strong analytical and problem-solving skills
- Skilled in technical processes, incident response, and change management (ITIL experience)
- Experienced with compliance and vulnerability management
- Capable of collecting and articulating forensic details for root cause analysis
- Self-starter who can lead project planning and architectural design efforts and attend team and product meetings
Mentorship and Task Management:
- Guide other Systems Engineers in prioritizing workload, discussing technical challenges, and collaborating on effective solutions
- Manage multiple tasks from start to finish
- Break down tasks to meet project objectives and ensure timely completion of tickets and tasks