Site Reliability Engineer

The role

We are looking for a dedicated and skilled Site Reliability Engineer (SRE) to support and optimise AWS infrastructure for critical systems, including website operations, mobile apps, BFF APIs, geolocation services, and Magnolia CMS. This role focuses on ensuring reliability, uptime, and performance while driving continuous improvements and operational excellence. This is an exciting opportunity for someone passionate about proactive problem-solving and advancing their technical expertise in a collaborative environment.

What you will do

Support AWS infrastructure for:

Website: Ensuring the operation, reliability, and optimal performance of the website, including the underlying AWS infrastructure. You will focus on improving uptime, since keeping a software platform or service running is the foundation of SRE
Backend for Frontend (BFF) APIs: Support and optimisation of the BFF APIs that facilitate communication between front-end interfaces and backend services, including AWS infrastructure
Mobile Apps: Support and reliability improvements for the new iOS and Android native applications
Geo Location: Support and maintenance of the live geolocation service provided by the BFF
Content Management System (CMS): Management and support of Magnolia CMS and any associated AWS infrastructure

This role includes rotational on-call support after business hours, with responsibilities distributed in compliance with labour law regulations.

Requirements

Cloud Architecture Automation: Expertise in automating cloud infrastructure using Infrastructure as Code (IaC) tools, preferably Terraform. Experience in designing scalable, resilient architectures on cloud platforms such as AWS
Build and Deployment Pipelines: Proficiency in setting up and managing Continuous Integration/Continuous Deployment (CI/CD) pipelines, preferably with GitHub Actions, to enable rapid, reliable code deployment and reduce time to market
Software Development Skills: Competence in programming (Java) and scripting languages to develop automation scripts and tooling and to contribute to the codebase
Monitoring and Observability: Implementing advanced monitoring solutions (e.g., AppDynamics, Prometheus, Grafana, New Relic) to gain insights into system performance and user experience, enabling proactive issue detection
Log Querying and Analysis: Querying and analysing log files using tools such as Splunk or ELK components
Incident Management and Response: Leading incident response efforts, performing root cause analysis, and implementing long-term solutions to prevent recurrence
Performance Optimisation: Analysing system performance and implementing optimisations to enhance speed, scalability, and efficiency
Containerisation and Orchestration: Proficiency in managing containerised applications using Docker. Experience working with AWS Elastic Container Service (ECS) to deploy and maintain scalable, secure, and efficient containerised solutions
Collaboration with Development Teams: Working closely with developers throughout the software lifecycle, participating in code reviews, and contributing to architectural decisions to embed reliability from the ground up
Automation of Manual Tasks: Identifying repetitive tasks and automating them to reduce toil, increase efficiency, and allow focus on strategic improvements
Customer-Centric Focus: Prioritising user experience by ensuring that services are not only available but also perform optimally from the user’s perspective
Proficiency in written and spoken English

What we offer

Hybrid / remote working model (*) and flexible working hours
Private Health Insurance
People Lead system for your personal development
A culture of continuous growth, providing various training resources
Referral System
Technical equipment of your choice
Agile mindset, simplified processes, and a great atmosphere where commitment and autonomy are celebrated
A community-first mindset working with talented people across technology products and consulting

(*) As we hire permanent employees for this role, we offer remote opportunities only in Romania, Lithuania and Hungary.

Additional benefits based on location:

Hungary:

20 vacation days, increasing according to labour law
Extensive cafeteria package

Lithuania:

20 vacation days
2 emergency days
Pension plan
Opportunity to work abroad for up to 180 days per year
Discounts from the benefit platform

Romania:

24 vacation days, increasing with tenure
2 emergency days
Meal tickets
Opportunity to work abroad for 1 month

About us

Zenitech is a leading technology solutions provider dedicated to reshaping the global digital landscape. Headquartered in the UK, Zenitech operates internationally, with offices in Lithuania, Romania, and Hungary.

We use a bespoke approach depending on where the client is on their digital journey, combining access to dedicated R&D labs, technology implementation advice, and specialist nearshore development talent. As an international community of individuals who are open to learning from each other, we collectively define and contribute to the digital future of our clients’ businesses.

Why Zenitech?

Impactful Projects: Drive meaningful change through digital transformation projects and have an opportunity to make an impact on many different industries.
Collaborative Culture: Be part of a diverse, inclusive team committed to growth, innovation, and continuous learning.
Professional Growth: Zenitech supports continuous learning and development through the People Lead system, helping you advance your skills and career.

Diversity and Inclusion

Zenitech celebrates diversity in all its forms. We aim to create an inclusive environment where everyone feels valued for their unique contributions and perspectives. If you require any adjustments during the application process, please let us know—we’re here to help. Our commitment to diversity, equity, inclusion, and belonging can be found here.


Employment Level: Mid
Office Location: Hungary (remote); Lithuania (remote); Romania (remote)
Department: Automation Engineering
Working Model: Remote
Contract Type: Full-Time, Regular Permanent Employment
