Site Reliability Engineer
Alter Solutions Portugal is an IT Consultancy Company, promoter of Digital Transformation, part of the Alter Solutions Group, created in 2006, in Paris.
In 2022, Alter Solutions joined the act digital group, forming a global community of technology talent with a presence in thirteen countries: Germany, Belgium, Brazil, Canada, United States of America, Mexico, Morocco, Spain, France, Luxembourg, Poland, Portugal and Serbia. In 2023, we were also certified as a Great Place to Work.
In Portugal, we partner with over 120 clients and have a team of over 500 people working on projects for industries as diverse as banking, insurance, transportation, aviation, energy, and telecom.
As the headquarters of the Nearshore IT center, Alter Solutions Portugal has a dedicated team of around 30 specialized professionals integrated into projects with several internationally renowned clients.
Job Description
We are looking for a Site Reliability Engineer responsible for improving high availability and resilience, managing load more effectively with L4 and L7 load balancers, building a dynamic and scalable infrastructure to accommodate high-volume business transactions, and setting up a monitoring system that logs performance and capacity levels to ensure high availability of applications with minimal downtime.
Main Responsibilities:
- Design, develop and implement systems software/scripts that improve the stability, scalability, availability, and latency of the Risk system applications.
- Solve problems occurring with our highly available production systems and build solutions & automation using a combination of scripting & tooling to prevent them from happening again.
- Define and drive adoption of a best-in-class monitoring framework to accomplish end-to-end flow monitoring and effective alerting.
- Monitor system performance and capacity levels to ensure high availability of applications with minimal downtime.
- Build and run capacity tests to manage the growth of systems.
- Investigate any service disruptions or other service issues to identify their causes.
- Perform regular audits of servers to check for signs of degradation or malfunction, including infrastructure hygiene and end-of-life status.
- Conduct post-mortem examinations of failed systems to identify and address root causes.
- Be accountable for the maintenance and improvement of IT continuity strategies.
- Be an advocate of release engineering best practices such as zero-downtime deployments, canary releases, and incremental rollouts.
- Work with Development, DevOps, and IT operational teams throughout the Software Development Life Cycle to ensure sustainable software releases.
Qualifications
- 4-6 years of experience in an IT Operations/DevOps/Application support/SRE team.
- Proven foundation in Linux administration and troubleshooting.
- Solid knowledge of APM tools (Dynatrace/AppDynamics).
- Good understanding of log aggregators (Splunk/ELK).
- Solid work experience with load balancers (L4 & L7), preferably Apache HTTP Server (httpd).
- Good understanding of TCP/IP and HTTP protocols, networking, DNS, firewalls, and F5 load balancing.
- Experience with Apache Tomcat servers and JVM performance troubleshooting.
- Knowledge of Jenkins, Ansible, Docker, Kubernetes, and Terraform.
- Knowledge of OpenStack, networking, security, or storage is desirable.
- Solid experience in at least one scripting language; Python preferred.
- Experience with building, operating, and maintaining scalable distributed systems, and with operations automation.