Johannesburg, South Africa
Lead: Site Reliability Engineer

Let's Write Africa's Story Together!

Old Mutual is a firm believer in the African opportunity and our diverse talent reflects this.

Job Description

ROLE OVERVIEW

The Head of Site Reliability Engineering (SRE) is a critical leadership position responsible for ensuring the bank's technology systems and services are reliable, scalable, and resilient. This role requires a deep understanding of infrastructure, monitoring, incident management, and automation, as well as a strong ability to lead and inspire a team of SRE engineers. The successful candidate will play a pivotal role in driving operational excellence, optimizing service delivery, and fostering a culture of reliability across the bank's digital ecosystem.

KEY RESULT AREAS

Strategy & Leadership

Define and implement the SRE strategy, ensuring alignment with the bank's business and technology goals.

Lead initiatives to enhance the reliability, availability, and performance of the bank's services.

Promote and embed SRE principles across engineering and operations teams.

Operational Reliability

Establish and maintain Service Level Objectives (SLOs) and Service Level Indicators (SLIs) to measure and improve service reliability.

Oversee the development and operation of monitoring, logging, and alerting systems to detect and resolve issues proactively.

Manage incident response and post-mortem processes, driving root cause analysis and preventive actions.

Automation & Efficiency

Drive automation of operational tasks to reduce manual effort and improve efficiency.

Lead initiatives to optimize system performance, reduce latency, and enhance system resilience.

Champion the use of infrastructure as code and other modern engineering practices.

Collaboration & Stakeholder Management

Partner with development, infrastructure, and security teams to ensure seamless integration of SRE practices.

Collaborate with business units to understand priorities and ensure reliability initiatives align with their needs.

Act as the primary point of contact for SRE-related discussions with internal and external stakeholders.

Team Leadership & Development

Build, mentor, and manage a high-performing SRE team, fostering a culture of collaboration and innovation.

Drive continuous learning and skill development within the team to stay ahead of technological advancements.

Identify and address resource gaps to ensure effective delivery of SRE initiatives.

ROLE REQUIREMENTS

Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.

10+ years of experience in infrastructure, operations, or site reliability engineering, with at least 3 years in a leadership role.

Strong expertise in monitoring tools (e.g., Datadog, Prometheus, Grafana) and incident management platforms (e.g., PagerDuty).

Experience in cloud platforms (AWS, Azure, GCP) and containerization technologies (Docker, Kubernetes).

In-depth knowledge of automation tools, scripting languages, and CI/CD pipelines.

Proven track record in driving system reliability, scalability, and performance improvements.

Exceptional leadership and people management skills, with a focus on team development and motivation.

Excellent problem-solving and analytical abilities, with strong attention to detail.

Outstanding communication and stakeholder management skills.

The appointment will be made from the designated group in line with the Employment Equity Plan of Old Mutual South Africa and the specific business unit.

Oversees the execution of IT & Digital strategy that optimises employee capabilities, achieves the organisation's strategic objectives, and delivers competitive advantage.

Responsibilities

Functional Strategy Formation

Lead the development and implementation of strategy for a significant area of responsibility, anticipating complex issues, challenges, and opportunities and ensuring integration with wider corporate strategy.

Enterprise Architecture

Develop a strategic architecture plan, ensuring that data features are prioritized appropriately, estimates are reliable, benefits can be realized, and design activities are proactively monitored and tracked to meet planned time frames and the overall architecture plan.

Enterprise Infrastructure Modernization

Define strategy for an enterprise architecture function that embeds digital assets and capabilities, supporting the design and implementation of digital strategy and organization, process, and policies. Lead analysis, evaluation, and development of enterprise long-term strategic and operating plans to ensure that the enterprise architecture is synchronized with ever-changing business needs and the complexity of digital transformation.

Infrastructure and Network Development and Maintenance

Direct and oversee infrastructure developments and maintenance to ensure business requirements can be met.

Application Software Development

Develop the most complex existing and new applications by analyzing and identifying areas for modification and improvement. Develop new applications to meet customer requirements.

Data Management

Take responsibility for developing and delivering a key element of the organization's data management system.

Horizon Scanning

Identify new external developments and/or emerging issues within an area of technology or business function and evaluate their potential impact on, or usefulness to, the organization.

Budgeting

Manage budget plans for a department. May involve development or delivery or both.

Leadership and Direction

Identify and communicate the actions needed to implement the function's strategy and business plan within the business area or department; explain the relationship to the broader organization's mission, vision, and values; motivate people to commit to these tenets and do extraordinary things to achieve local business goals.

Organizational Capability Building

Evaluate the capabilities of staff within the department to identify gaps and prioritize development activities. Implement the organization's formal development frameworks within the area of responsibility. Coach and mentor others to support the development of the organization's talent pool.

Performance Management

Manage and report on performance within the department or area of responsibility; set appropriate performance objectives for direct reports and hold individuals accountable for achieving them; take appropriate corrective action where necessary to ensure the achievement of annual business objectives.

External Consultant or Contractor Engagement

Identify the requirement for, and participate in the selection of, external consultants or advisers to deliver key projects and/or ad hoc services; ensure business objectives and requirements are clearly understood and monitor outcomes, taking appropriate remedial action where necessary.

Skills

Action Planning, Adaptive Thinking, Business Requirements Analysis, Cultural Awareness, Database Administration, Data Compilation, Data Controls, Data Management, Evaluating Information, Executing Plans, Expertise Management System, IT Architecture, Policies & Procedures, User Requirements Documentation, Wireless Network Management

Competencies

Business Insight

Cultivates Innovation

Drives Results

Ensures Accountability

Manages Complexity

Nimble Learning

Optimizes Work Processes

Strategic Mindset

Education

Closing Date

26 December 2024, 23:59

Old Mutual Limited is pro-vaccination and encourages its workforce to be fully vaccinated against Covid-19.

All prospective employees are required to disclose their vaccination status as part of the recruitment process.

Please refer to Old Mutual’s Covid-19 vaccination policy for further detail. Kindly note that Old Mutual reserves the right to reinstate the requirement to vaccinate at any point if it is of the view that it is imperative to do so.