This job description is for our direct client, which is hiring immediately and is one of the largest companies in the US, with more than 30,000 employees across the country.
About Outcome Logix:
Established in 2020 at the peak of the pandemic, we are a self-funded organization driven by a passion to reshape the hiring landscape. Recognizing a significant gap in the candidate submission process employed by most service companies, we embarked on a mission to revolutionize hiring and create an unparalleled experience for hiring managers and candidates. Before launching our professional staffing services, we invested in developing a robust Applicant Tracking System (ATS) with an integrated interviewing platform. Our approach goes beyond the conventional practice of merely matching keywords on a resume. Instead, we align each requirement with a dedicated team of Recruiters, Account Managers, fractional Recruiting Managers, and industry experts who evaluate candidates based on core competencies. This meticulous process not only saves time and effort but also ensures that candidate submissions consistently exceed expectations.
Roles and responsibilities:
- Own the end-to-end availability, security, and performance of mission-critical applications and services that are part of the Digital systems.
- Analyze technical issues, identify the root cause, and provide fixes in the production environment (never solve the same problem twice).
- Partner with multiple internal teams to groom the nonfunctional requirements and work on implementations.
- Automate or streamline manual tasks and redundancies within the infrastructure organization.
- Implement SRE best practices to ensure availability, reliability, and fault tolerance wherever applicable.
- Drive product reliability improvements through monitoring, alerting, and application of software development best practices.
- Identify creative ways to break the system, uncover and report nonfunctional defects, as well as validate that systems/solutions are operating as intended.
- Think critically to identify problems before they happen.
- Troubleshoot performance and stability issues using a wide variety of tools.
- Evaluate and manage application and environment security.
Qualifications:
- 6+ years of total IT experience, including a reliability engineering role and Java.
- Hands-on experience with performance analysis, scalability, and reliability testing techniques. Python scripting is an additional advantage.
- Strong knowledge and hands-on experience with observability tools such as Datadog and New Relic.
- Hands-on experience with cloud services, preferably AWS. Microservices experience is an additional advantage.
- Hands-on experience with SRE practices and writing.
- Knowledge of the HCL Commerce and IBM Sterling platforms is advantageous.
- A strong critical thinker who identifies problems before they happen.
- Strong written and oral communication skills with a high degree of comfort speaking with engineering management, developers, and leadership.
- Demonstrated ability to adapt to new technologies and learn quickly.
Note: This role requires candidates to be currently residing in Boston or willing to relocate to the area. Visa preferences are not a factor, as long as you possess the necessary skills and motivation we are seeking.