Senior Site Reliability Engineer FinTech

Full time at Booking Holdings in Bahrain
Posted on April 7, 2024

Job Details

Senior Site Reliability Engineers I (Senior SRE I) are specialists in treating operations as a software problem. They focus on the reliability of systems and services, addressing availability, performance, scalability, latency, observability and efficiency. They maintain key components and develop systems that minimize human labor (through automation) and increase system reliability, with the end goal of breaking the relationship between system size, operational toil and complexity.
  1. Senior SRE I are responsible for implementing technical solutions based on business requirements. They can estimate the effort and impact of the items they work on, and show a high quality of craft in what they deliver.
  2. Senior SRE I work primarily within the scope of their team while occasionally collaborating across partner teams. They are expected to work together with colleagues (potentially in other job roles) to design and implement technical tasks. They are also expected to actively participate in incident response for issues affecting their team.
  3. Because the required technical skills and commercial knowledge can vary from one area to another, Senior SRE I can wear several hats: member of a business service owner team, owner of a piece of infrastructure, and/or consultant to product development teams on Site Reliability Engineering matters.
Building software applications
  1. Is responsible for building software applications by using relevant development languages and applying knowledge of systems, services and tools appropriate for the business area.
  2. Is responsible for refactoring and simplifying code by introducing design patterns when necessary.
  3. Is responsible for ensuring the quality of the application by following standard testing techniques and methods that adhere to the test strategy.
  4. Has sufficient knowledge to write readable and reusable code by applying standard patterns and using standard libraries.
  5. Has sufficient knowledge to maintain data security, integrity and quality by effectively following company standards and best practices.
Software Systems Design
  1. Has sufficient knowledge to evaluate possible architecture solutions by taking into account cost, business requirements, technology requirements and emerging technologies.
  2. Has sufficient knowledge to describe the implications of changing an existing system or adding a new system to a specific area, by having a broad, high-level understanding of the infrastructure and architecture of our systems.
  3. Has sufficient knowledge to help grow the business and/or accelerate software development by applying engineering techniques (e.g. prototyping, spiking and vendor evaluation) and standards.
  4. Has sufficient knowledge to meet business needs by designing solutions that meet current requirements and are adaptable for future enhancements.
End to End System Ownership
  1. Has sufficient knowledge to own a service end to end by actively monitoring application health and performance, setting and monitoring relevant metrics, and acting accordingly when they are violated.
  2. Has sufficient knowledge to reduce business continuity risks and bus factor by applying state-of-the-art practices and tools, and writing the appropriate documentation such as runbooks and OpDocs.
  3. Has sufficient knowledge to reduce risk and obtain customer feedback by using continuous delivery and experimentation frameworks.
  4. Is responsible for independently managing an application or service by working through deployment and operations in production.
  5. Has sufficient knowledge to maintain data security, integrity and quality by effectively following company standards and best practices.
Technical Incident Management
  1. Has sufficient knowledge to address and resolve live production issues by mitigating the customer impact within SLA.
  2. Has sufficient knowledge to improve the overall reliability of systems by producing long term solutions through root cause analysis.
  3. Has sufficient knowledge to keep track of incidents by contributing to postmortem processes and logging live issues.
Automation and toil reduction
  1. Has sufficient knowledge to ensure that infrastructure stays current by reducing technical debt, searching for bottlenecks and preparing for scaling.
  2. Has sufficient knowledge to reduce cost of operations and maintenance by leveraging new technologies and automation, and by partnering with vendors to ensure we stay current.
  3. Has sufficient knowledge to reduce human labor by writing small software features that address availability, scalability, latency and efficiency.