Job code 599
ABOUT THE ROLE
We are looking for an experienced DevOps/Site Reliability Engineering Manager to bootstrap, grow and lead a new SRE team. You will define and set up processes, ways of working, logging, alerting, on-call rotation, metrics and KPIs for the private PaaS cloud of one of our customers, with an ever-watchful eye on its availability, latency, performance and capacity.
Our teams are part of an expanding global-scale innovation project, creating automation services to speed up R&D work for tens of thousands of engineers.
We are searching for a person with both a managerial and a hands-on approach who is willing to contribute to all required tasks. The ideal candidate is skilful, communicative and dynamic, takes ownership of their deliverables, and has experience working with agile software development and with multiple stakeholders at various levels, ideally with previous experience of complex, large-scale systems.
- Hands-on technical experience.
- 5+ years of mixed professional experience as a Software Engineer, DevOps Engineer or Site Reliability Engineer
- Proven expertise in recruiting and managing a team of enthusiastic, experienced engineers on large-scale projects.
- Previous experience setting up on-call, escalation, monitoring and alerting systems
- Capable of technical deep-dives into code, networking, operating systems and storage, yet verbally and cognitively agile enough to hold your own in a strategy discussion with the customer’s architects and engineers
- Experience deploying, managing and optimizing container orchestration using Kubernetes and Helm
- Monitoring and alerting technologies: Grafana, Prometheus, Zabbix, Graphite.
- ELK stack knowledge
- Experience in one or more of C, C++, Java/Kotlin, Go, Python
- Scripting experience: Shell, Perl and/or Python.
- CI/CD pipeline management and associated tools: Jenkins, Spinnaker, GitLab CI
- Source control tools: Git, GitLab, Gerrit.
- Experience working in Agile Scrum teams
- Proficiency in algorithms, data structures, complexity analysis and software design and/or expertise in Unix/Linux systems, IP networking, performance and application issues.
- Expertise in problem solving and analyzing global-scale distributed systems.
- Effective management and communication skills.
- Previous experience with MySQL, MariaDB, PostgreSQL, Cassandra, Redis, Galera Cluster
- RabbitMQ/Kafka/ActiveMQ knowledge
- Experience with public clouds such as AWS/Azure/GCP
- Infrastructure-as-code tools such as Terraform.
WHAT WE OFFER
- Flexible working-from-home policy
- Market-leading Pension Package
- Health & Life Insurance
- Excellent opportunities to experiment with new technologies
- Learning and development opportunities
- Comprehensive Relocation Assistance
- Innovative and challenging work culture
- Fantastic central location
If the above sounds like something you would be passionate about and you possess the required skills, we look forward to hearing from you!
We are a dynamic, Irish-owned, professional services company, located in the heart of Dublin City. From banking to telecoms, from train stations to space stations, we enable customers to keep vital services running. By providing solutions, consultancy and training, we equip our clients with the tools and people to overcome their challenges and develop cutting-edge technologies in the areas of DevOps, automation, CI and Cloud. Our employees get an opportunity to broaden their horizons by working on technologies that will shape the future.
Ammeon is an equal opportunities employer. Ammeon reserves the right to ask an employee to be flexible in their duties when business needs require it.
Ammeon does not accept agency resumes and does not take any responsibility for any fees related to unsolicited resumes.