Site Reliability Engineer C-477

Company:

Smash Cr


Job details

SMASH: Who we are

We are agents for tech professionals in Costa Rica and Colombia who help them build careers in the United States. We believe in long-lasting relationships with our talent. We invest time getting to know them as individuals and understanding what they are looking for as their next professional step. We aim to find the perfect match. As agents, we make sure to pair our talent with our US clients, not only on technical skills but also on cultural fit. Our core competency is to find the right talent, fast. We purposefully move away from the "contractor" or "outsourcing" type of relationship. Our clients don't want contractors or "just a service." Neither does our talent.

Our Benefits

- Work from Home
- English Academy for Employees and Relatives
- Business Skills Coach
- Certifications
- Discounts with Tech Universities
- Events and additional Perks

Job Description

The Site Reliability Engineer is responsible for keeping all member-facing and internal production systems running smoothly. As an SRE you will work with multiple teams to encourage SRE principles, maintain the availability and reliability of systems, establish SLIs/SLOs, and develop tools and monitoring for operational visibility. SREs are members of the scrum teams and work closely with quality and software engineers to support services prior to general availability through activities such as launch reviews, reviewing performance, and validating logging in dev environments. They are responsible for ensuring quality releases to production environments. The SRE participates in an on-call rotation, working with internal and vendor teams to manage, troubleshoot, and resolve production issues.

To be effective, an individual must be able to perform each job duty successfully:

- Keep current with emerging testing techniques and technologies, as well as emerging development practices.
- Assist in diagnosing, finding the root cause of, reporting, and tracking production and non-production issues.
- Continually research new ways of improving and scaling systems and services.
- Lead initiatives to improve the reliability, scalability, and availability of production applications. Build out tools, platforms, and processes to enable these goals.
- Lead and contribute to the design, development, and improvement of SRE practices and procedures.
- Create and maintain health dashboards, identifying and measuring health indicators and SLIs/SLOs, and providing tools for operational visibility of production systems.
- Participate in and contribute to improving our incident response, acting as an escalation point for production incidents.
- Perform root cause analysis (RCA), troubleshoot, and debug issues across our applications and services to identify and fix the root cause.
- Enhance and maintain the software release procedures and processes.
- Show a strong desire and aptitude for system automation to eliminate manual work in day-to-day operations.
- Be skilled with application monitoring practices and tools (New Relic, Azure Monitor, DataDog, Splunk, etc.).
- Understand and have experience with SRE and DevOps principles.
- Have demonstrated experience working in Agile teams leveraging Scrum, Kanban, or other methodologies, and/or an understanding of Agile development concepts.
- Meet the needs of the end user in a quality, consistent, and professional manner, using independent judgment where appropriate.
- Mentor less experienced engineers.
- Bring excellent verbal and written communication skills, exceptional problem-solving skills, and exceptionally professional behavior when interacting with other technical teams throughout the organization.
- Take part in an on-call rotation.
- Perform additional duties and responsibilities as assigned.

Experience

- Minimum 4 years of professional experience in site reliability engineering, software development, or systems administration.
- Experience monitoring or troubleshooting web applications.
- Experience with Scrum and associated tools such as Azure DevOps or Jira.
- Experience with some of the following tool sets:
  - Application monitoring tools (New Relic, DataDog, Splunk, etc.)
  - Automation tools (Pega, Microsoft Power Platform, Logic Apps, etc.)
  - API tools (Rest#, Postman, Swagger, etc.)
  - Front end tools (Selenium, Page Object Model, etc.)
  - Backend tools (SQL Server, Entity Framework, Dapper, etc.)
  - Build tools (Node, Docker, Azure Pipelines, etc.)
  - Infrastructure as Code (Terraform, Ansible, Chef, etc.)
- Experience with automating, monitoring, and/or alerting on some of the following:
  - Web applications in Angular and React
  - Internal support tools
  - 3rd party integrations
  - Database and API connections (REST and SOAP)
  - Cloud solutions (AWS, Azure, or others)
- Experience working in an agile CI/CD or rapid software testing environment.
- Experience with and understanding of Git and source control concepts.


Source: Talent_Ppc



