Site Reliability Engineer
1 month ago
Paymentology is the first truly global issuer-processor, giving banks and fintechs the technology, team and experience to rapidly issue and process Mastercard, Visa and UnionPay cards across more than 50 countries, at scale.
Our advanced, multi-cloud platform, which offers both shared and dedicated processing instances, together with our vast global presence and richer, real-time data, sets us apart as the leader in payments.
We're on the hunt for an exceptional **Site Reliability Engineer (SRE)** to join our dedicated team. As an SRE at Paymentology, you'll be the superhero responsible for maintaining, improving, and ensuring the high availability, scalability, and performance of our platform.
**Requirements**:
**What it takes to succeed**:
- Bachelor’s Degree in Computer Science, Information Technology, or related field.
- A minimum of 3 years in a dedicated SRE role, as well as 5+ years of prior software development experience.
- Comprehensive understanding of large-scale distributed platform architecture.
- Extensive hands-on cloud experience, particularly with AWS.
- Proven experience developing scalable, modular infrastructure-as-code projects using tools such as Terraform, CloudFormation, Puppet, and Ansible.
- Practical experience with Docker and container orchestrators, including AWS ECS & EKS, and Kubernetes.
- Experience in administering or integrating identity management systems for SSO, including AWS IAM, Okta, and Active Directory.
- Experience with disaster recovery and redundancy strategies in both cloud and on-premises environments.
- Proficiency with leading monitoring tools, such as Datadog, Honeycomb.io, Splunk, Prometheus, Grafana, ELK Stack, and New Relic.
- Programming expertise, especially in systems programming languages (e.g., Java, Kotlin, Scala) and databases (e.g., SQL Server, PostgreSQL).
- Familiarity with industry-leading CI/CD tools such as Jenkins, GitHub Actions, GitLab CI, AWS CodePipeline, CircleCI, and Argo CD.
- Track record of achieving platform-level and end-to-end SLIs, SLOs, and SLAs, and fostering accountability.
- Ability to navigate complex situations and lead effective post-incident reviews (PIRs).
- Knowledge of implementing solutions to reduce Mean Time to Identify (MTTI) and Mean Time to Resolve (MTTR).
- Expertise in implementing best practices for load balancing, fault tolerance, and resource allocation to maintain service quality and efficiency at scale.
- Understanding of security best practices within cloud environments.
You'll also need to bring a collaborative mindset, working seamlessly across teams to drive innovative solutions. And of course, your exceptional communication skills in English will allow you to clearly convey your ideas and recommendations.
As a key member of our technical team, you will be expected to maintain high availability and be ready to address critical incidents, ensuring the continuous performance of our systems. This includes being part of an on-call schedule to support 24/7 operations.
**Why Paymentology?**
- Full-time remote position with flexible hours.
- An inclusive and supportive work environment that values diversity.
- A chance to work on cutting-edge technology projects that make a difference.
- Opportunities for continuous learning and development.
**What you get to do**:
**Platform Reliability and Scalability**:
- Build software that enhances Paymentology services' scalability and reliability.
- Ensure platform services meet required uptime and service quality levels.
- Contribute to the design of reliable cloud infrastructure and implement reusable cloud-uptime components as code.
- Regularly review and optimise SRE practices, tools, and methodologies to enhance overall system reliability and team efficiency.
**Observability and Automation**:
- Contribute to the design, implementation, and maintenance of observability and monitoring solutions that track the platform's health, cost-effectiveness, reliability, and scalability, and that surface potential issues to feed back to product and platform engineering in a continuous improvement loop.
- Develop and implement automation scripts and tools to streamline operations and reduce manual interventions.
- Enable product teams to self-serve by participating in the development of a developer platform.
**Production Issue Resolution**:
- Play an active role with the incident response teams, diagnosing and resolving production issues quickly to minimise downtime.
**Standards Compliance**:
- Support product teams in building services that adhere to our security and quality standards.
**Cross-team Collaboration**:
- Work closely with engineering, operations, and product teams to ensure reliability is considered throughout the end-to-end software development lifecycle. We seek to achieve this through advocacy and developing a culture of reliability.
**What you can look forward to**:
At Paymentology we value making a difference to the lives of the people who work for us and
-
Site Reliability Engineer
5 months ago
Jakarta, Indonesia Pro Sigmaka Full time
We were established in 2012. With experience in several industry sectors, a broad portfolio and technology platform, and a dedicated and highly qualified team, we enable the talent we supply to deliver fast and responsive services, making us the best choice for companies that want to increase the usability of their businesses. OUR SERVICES -...
-
Site Reliability Engineer
4 weeks ago
Jakarta, Indonesia PT. Amalura Multi Dimensi Full time
Manage and optimize cloud infrastructure (AWS, GCP, Azure). - Administer Linux systems, ensuring stability and security. - Implement observability (e.g., OpenTelemetry, Honeycomb, Sentry) to monitor performance. - Optimize content delivery networks (e.g., Akamai) to enhance user experience. - Design monitoring, alerting, and incident response procedures for...
-
Site Reliability Engineer
7 months ago
Jakarta, Indonesia PT Tiga Daya Digital Indonesia (Eksad Technology) Full time
Tiga Daya Digital Indonesia is a subsidiary of Triputra Group and DCI Group that aims to be an IT partner enabling rapid client growth. Eksad provides high-quality services based on strong experience in the industry and technology, building the right IT service solutions to enable its partners to speed up business development based on digital technology by...
-
Site Reliability Engineer
5 months ago
Jakarta, Indonesia Digital Muda Solutions Full time
Description: - Maintain system availability, reliability, and performance with a focus on technical infrastructure, security, and user scale. - Collaborate with development and operations teams to design, test, and implement best practices in technology infrastructure, and make fixes and improvements as needed. - Ensure integration...
-
Site Reliability Engineer
7 months ago
Jakarta, Indonesia PT Salva Teknologi Digital Full time
Site Reliability Engineer (Junior) - Applicants should have sufficient qualifications and relevant experience in the respective fields. "Beware of scams during the interview process. The company will never charge any fee for the interview process. Please report to us immediately if, when you are invited to an interview and...
-
Site Reliability Engineer
2 months ago
Jakarta, Indonesia Hukumonline.com Full time
Manage and optimize cloud infrastructure on AWS, GCP, and Azure. - Administer and maintain Linux-based systems, ensuring their stability and security. - Implement and maintain observability solutions, including OpenTelemetry, Honeycomb, and Sentry, to monitor system performance and diagnose issues. - Configure and optimize content delivery networks, with a...
-
Site Reliability Engineer
5 months ago
Jakarta, Indonesia AccelByte Full time
At AccelByte, our mission is to empower game creators by providing them with the backend platform and tools required to make scalable, reliable AAA-quality games. The company was founded in 2016 by industry veterans who have engineered online systems for some of the largest game and distribution platforms in the world including Fortnite, Epic Store, Xbox...
-
Senior Site Reliability Engineer
6 months ago
Jakarta, Indonesia DKatalis Full time
**Site Reliability Engineer**: **About DKatalis** DKatalis is a financial technology company with multiple offices in the APAC region. In our quest to build a better financial world, one of our key goals is to create an ecosystem-linked financial services business. DKatalis is built and backed by experienced and successful entrepreneurs, bankers, and...
-
Site Reliability Engineer
5 months ago
Jakarta, Indonesia PT Astra Digital Mobil (mobbi) Full time
Job Description: - Maintain system availability, reliability and performance by focusing on technical infrastructure, security and user scale. - Collaborate with development and operations teams to design, test, and implement best practices in technology infrastructure, and make fixes and improvements as needed. - Conduct in-depth analysis of incidents and...
-
Site Reliability Engineer (DevOps)
6 months ago
Jakarta, Indonesia Digital Muda Solutions Full time
Description: - Maintain system availability, reliability, and performance with a focus on technical infrastructure, security, and user scale. - Collaborate with development and operations teams to design, test, and implement best practices in technology infrastructure, and make fixes and improvements as needed. - Ensure integration...
-
Senior Site Reliability Engineer
5 months ago
Jakarta, Indonesia AccelByte Full time
At AccelByte, our mission is to empower game creators by providing them with the backend platform and tools required to make scalable, reliable AAA-quality games. The company was founded in 2016 by industry veterans who have engineered online systems for some of the largest game and distribution platforms in the world including Fortnite, Epic Store, Xbox...
-
Site Reliability Engineer
2 months ago
Jakarta, Indonesia Flip Full time
**About Flip** Rafi, Luqman, and Anjar, who were college friends at Universitas Indonesia, started Flip as a project in 2015 to transfer payments to each other at a fraction of what banks would charge them. They are pioneers in the Indonesian market, with their technology now helping millions of Indonesians, both individuals and businesses, carry out...
-
Site Reliability Engineer
5 months ago
Jakarta, Indonesia Group Hijra (Hijra Bank, ALAMI P2PL) Full time
Hijra (previously known as the ALAMI Group) is a financial technology company that follows sharia principles. It was founded in 2018 by Dima Djani, Harza Sandityo, and Bembi Juniar. The company offers a range of services, including a peer-to-peer lending platform, a mobile banking app, and mortgages, all of which are based on sharia principles. Hijra has a...
-
Site Reliability Engineer Manager
6 months ago
Jakarta, Indonesia AccelByte Full time
At AccelByte, our mission is to empower game creators by providing them with the backend platform and tools required to make scalable, reliable AAA-quality games. The company was founded in 2016 by industry veterans who have engineered online systems for some of the largest game and distribution platforms in the world including Fortnite, Epic Store, Xbox...
-
Reliability Engineering Specialist
2 months ago
Jakarta, Indonesia Bureau Veritas Full time
Reliability Engineering Specialist **Date**: 24 Sep 2024 **Location**: Jakarta, Jakarta, ID **Company**: Bureau Veritas 1. Bachelor's Degree in Mechanical Engineering, Physics Engineering, or equivalent. 2. Minimum 8 years of working experience with a focus on reliability engineering. 3. Involvement in at least 7 projects as rotating or reliability Engineering...
-
Site Reliability Engineer I
6 months ago
Jakarta, Indonesia AccelByte Full time
At AccelByte, our mission is to empower game creators by providing them with the backend platform and tools required to make scalable, reliable AAA-quality games. The company was founded in 2016 by industry veterans who have engineered online systems for some of the largest game and distribution platforms in the world including Fortnite, Epic Store, Xbox...
-
Site Reliability Engineer for Fita
3 months ago
Jakarta, Indonesia PT Telkomsel Ekosistem Digital Full time
Fita is a health-tech platform that brings together a community of fitness enthusiasts and expert coaches. Our mission is to empower Indonesians of all fitness levels to achieve their goals, maintain a healthy lifestyle, and build lasting habits through personalized virtual coaching sessions. **What you will do but not limited to**: - Manage infrastructure...