Site Reliability Engineering Manager
Shield is a global startup with offices in Tel Aviv, New York, London, and Lisbon.
We’re growing and looking for another important piece of the puzzle.
Is it you?
Let’s get down to business:
What you will do
Key Responsibilities:
- Establish and nurture a culture of excellence within the SRE team, promoting best practices, effective work processes, and methodologies. Lead by example and mentor the team to foster a collaborative and high-performing environment.
- Set clear team goals and priorities in alignment with organizational objectives. Ensure resources are available and allocated efficiently to meet project timelines and deliverables.
- Recruit, train, and develop team members, providing guidance and support to enhance their skills and career progression. Encourage continuous learning and adaptability to new technologies and methodologies.
- Design, implement, and maintain scalable and reliable infrastructure solutions.
- Develop and deploy monitoring, alerting, and logging systems to proactively identify and mitigate operational issues.
- Review and refine existing alerts, working closely with developers to automate responses and enable self-healing.
- Develop and maintain monitoring dashboards that provide clear and actionable insights into application reliability and system performance.
- Conduct capacity planning and performance tuning to optimize system performance and resource utilization.
- Automate repetitive tasks and processes to streamline operations and improve efficiency.
- Lead incident response and resolution, including rapid troubleshooting, coordinating cross-functional teams, root cause analysis, and post-mortem reviews.
- Develop and maintain incident response procedures and runbooks to ensure efficient and effective handling of incidents.
- Communicate effectively with stakeholders during incidents, providing timely updates and managing expectations.
- Continuously evaluate and adopt new technologies and methodologies to enhance our infrastructure and operations.
- Oversee and optimize our cloud infrastructure on AWS, ensuring scalability, reliability, and cost-effectiveness.
- Regularly analyze cloud service usage and expenses, implementing strategies to optimize costs.
Requirements:
- Bachelor’s degree in Computer Science, Information Technology, or a related field.
- 6+ years of experience as a site reliability or platform engineer, preferably in a scaling environment.
- At least 2 years in a leadership role, demonstrating effective team management, mentorship, and strategic planning.
- Hands-on experience with Terraform and Terragrunt.
- Extensive knowledge of Kubernetes and containerization technologies.
- Hands-on experience with the Prometheus stack.
- Ability to design and develop code using Python or Go.
- Strong inclination toward automating manual tasks and processes to improve operational efficiency.
- Excellent troubleshooting abilities with a methodical approach to diagnosing and resolving issues.
- In-depth knowledge of cloud services, particularly AWS, including best practices in security and compliance.
- Excellent communication abilities to coordinate effectively with both technical and non-technical stakeholders.
Oh hey, you made it all the way here!
So, in case you were wondering, Shield is how compliance teams in financial services can finally read between the lines to see what their employee communications are really saying.
We are a Series B startup ($35m) with some of the largest financial organizations in the world as investors and customers. Our platform analyzes digital interactions to do good in the world: we help protect market integrity and people’s financial assets.
Shielders listen more intently. Pay closer attention to the details. Make the extra effort. Care. It’s what we do at Shield every day, and not just for our customers, but for everyone we work with. It’s all about creating a world where people understand and trust each other.
Job posting details
Company: Shield
Location: Lisbon, Portugal
Published: 15 March 2025