CornerStone Technology Talent Services is seeking a highly skilled Application Monitoring Analyst to monitor critical systems, manage alerting, and provide support that keeps our client's operations running smoothly. The ideal candidate will have 3-5 years of experience with monitoring tools such as Dynatrace or CloudWatch, and a solid understanding of cloud architecture and DevOps principles.
Location: This position requires 50% onsite presence in North Fort Worth. Occasional weekend availability may be required.
Work Environment: This role operates in a 24/7 environment, and after-hours support is required.
Key Responsibilities:
- Incident and System Management: Collaborate with internal teams and suppliers to analyze and resolve critical IT and Telecom service interruptions. Maintain system availability through incident, problem, and change management.
- System Monitoring and Optimization: Monitor systems for faults, identify optimization opportunities, and implement tools and process changes to enhance monitoring and alerting efficiency.
- Incident Response and Root Cause Analysis: Engage with major incident response teams for escalations and monitor ongoing major incidents.
Qualifications:
- Self-Motivation: Ability to define, develop, and execute plans; manage system outages; and handle high-stress situations effectively.
- Availability: Willingness to work in a 24/7 environment and provide on-call support as needed.
- Experience: Proven track record of interacting at all levels within an organization.
- Technical Skills:
  - Bachelor's degree in Computer Science, Information Systems, or Engineering preferred.
  - Technical certifications or 5+ years of experience in event monitoring and alerting, DevOps, Infrastructure Support, or IT Major Incident Management.
  - Proficiency with monitoring tools (e.g., Dynatrace, CloudWatch, Zabbix, SCOM).
  - Experience in DevOps application performance tuning.
  - Strong writing skills for documentation.
  - Expertise in distributed systems administration (Windows, Unix, Linux, VMware, etc.).
  - Knowledge of ITIL best practices (certification is a plus).
  - Familiarity with the Software Development Lifecycle (SDLC).
  - Experience in SLA/KPI-driven environments.
  - Proficiency with ServiceNow.
  - General scripting/programming skills (Python, Node.js, Ruby, Perl, Bash/sh).
Preferred Qualifications:
- Cloud certifications (e.g., AWS, Azure).
- Experience with infrastructure as code tools (e.g., Terraform, Ansible).
- ITIL V3 or V4 certification.
- Advanced technical skills across various operating systems and environments.
- Proven ability to improve monitoring and alerting processes.