Senior Site Reliability Engineer

1 week ago


Jakarta, Indonesia Shopee Full time

- Department: Engineering and Technology
- Level: Experienced (Individual Contributor)
- Location: Indonesia - Jakarta

The Engineering and Technology team is at the core of the Shopee platform development. The team is made up of a group of passionate engineers from all over the world, striving to build the best systems with the most suitable technologies. Our engineers do not merely solve problems at hand; we build foundations for a long-lasting future. We don't limit ourselves in what we can or can't do; we take matters into our own hands even if it means drilling down to the bottom layer of the computing platform. Shopee's hyper-growing business scale has transformed most "innocent" problems into huge technical challenges, and there is no better place to experience it first-hand if you love technologies as much as we do.

**About the Team**:
**About Sea Labs**

Sea Labs is at the core of the Sea platform development, supporting diverse business lines including e-commerce, supply chain, games, payment, and finance, among many others. The strong growth and unique positioning of Sea's e-commerce business, Shopee, spurred the launch of Sea Labs Indonesia. Since its inception, passionate engineers have charted the course to drive the best experience for our users in Indonesia, and many of their solutions have even been adapted to other regional markets.

Sea's hyper-growing business scale has transformed most "innocent" problems into huge technical challenges, and there is no better place to experience world-class projects first-hand if you love technologies as much as we do. Together with our passionate and driven teams, you'll get to develop your skills, build on industry knowledge, and collaborate with global teams in a dynamic space. Browse our Sea Labs Indonesia team openings to see how you can make an impact with us.

**About Team**

The Games Site Reliability Engineer (SRE) team is mainly responsible for SRE support for Shopee Games. Our work scope includes, but is not limited to, maintaining and improving the stability of our systems, optimizing resources, and improving efficiency. Shopee Games is a set of games and gamification features that drive user engagement, and games have already become a key engagement feature for Shopee. We have a mature container management platform and various common components that are used extensively. We have broad room for growth and challenges, and we welcome you to join us.

- Be responsible for ensuring the reliability of the business, including but not limited to monitoring and alerting, incident management, business continuity management, resource and capacity management, campaign support, etc.
- Take part in the planning and development of operational tools to automate processes, improve efficiency, and reduce costs.
- Enhance the existing stability assurance system, drive the implementation of best practices and processes for SRE operations, ensuring scalability, reliability, and performance.
- Collaborate with the Dev team and provide pertinent technical solutions based on their requirements. Proactively communicate to secure their support and ensure the successful delivery of relevant projects.
- Take responsibility for 24/7 monitoring and response for the Games business: respond promptly to live incidents, and quickly locate and recover from faults to ensure business stability.

**Requirements**:

- Bachelor's degree or above in Computer Science or related fields
- 3+ years of experience as an SRE, DevOps, or System Engineer
- Expert in shell scripting, ideally with familiarity in Python or Go; React and JavaScript are also highly preferred
- In-depth understanding of networking, Linux, and traffic scheduling
- Familiar with Jenkins and GitLab; experienced in CI/CD process development and integration
- Familiar with commonly used middleware and databases, such as Codis, Redis, MQ, and MySQL
- Familiarity with Docker/k8s, including the related underlying technology and principles, is preferred
- Able to respond promptly to and handle all fault incidents
- An effective team player with a customer service orientation
- Meticulous and attentive to detail with strong critical thinking, data analytics, and problem-solving capabilities
- Able to communicate effectively in English to work with stakeholders in other regions
- Have a passion for reliable and performant systems, and care deeply about the end-user experience



  • Jakarta, Indonesia Amartha Full time

    Amartha is embarking on an exciting new journey and is in need of experienced engineers to work with senior management, existing engineers, and product in shaping the next wave of innovative product offerings, ensuring Amartha leapfrogs into the next phase of its journey! Job Description: As a Site Reliability Engineer (SRE) you will combine software and...


  • Jakarta, Indonesia DKatalis Full time

    **Site Reliability Engineer**: **About DKatalis** DKatalis is a financial technology company with multiple offices in the APAC region. In our quest to build a better financial world, one of our key goals is to create an ecosystem linked financial services business. DKatalis is built and backed by experienced and successful entrepreneurs, bankers, and...


  • Jakarta, Indonesia AccelByte Full time

    At AccelByte, our mission is to empower game creators by providing them with the backend platform and tools required to make scalable, reliable AAA-quality games. The company was founded in 2016 by industry veterans who have engineered online systems for some of the largest game and distribution platforms in the world including Fortnite, Epic Store, Xbox...


  • Jakarta, Indonesia Abhidi Solution Private Limited Full time

    **Responsibilities**: - Administer production-related jobs - Address production issues - Improve system reliability through configuration or code changes - Monitor systems and improve system observability - Remove toil and automate whenever possible - Problem solving, including troubleshooting production issues **Skills**: - Experience with cloud...


  • Jakarta, Indonesia PT Midas Daya Teknologi Full time

    Job Description: Works with project engineering to ensure the reliability and maintainability of new and modified software. - The reliability engineer is responsible for adhering to the life cycle software management process throughout the entire life cycle. - Also responsible for end-to-end site reliability including service offerings, in particular...


  • Jakarta, Indonesia Pro Sigmaka Full time

    We were established in 2012. With experience in several industry sectors, a broad portfolio and technology platform, and a dedicated and highly qualified team, we enable the talent we provide to deliver fast and responsive services, making us the best choice for companies that want to increase the usability of their businesses. OUR SERVICES -...


  • Jakarta, Indonesia Ajaib Full time

    Company Description **Job Description**: - Perform day-to-day operations to support developers and DevOps. - Create end-to-end monitoring, logging, and alerting system. - Provide technical assistance to improve system performance, capacity, reliability and scalability - Perform root cause analysis of reliability issues. - Document every action so your...


  • Jakarta, Indonesia PT. Amalura Multi Dimensi Full time

    Manage and optimize cloud infrastructure (AWS, GCP, Azure). - Administer Linux systems, ensuring stability and security. - Implement observability (e.g., OpenTelemetry, HoneyComb, Sentry) to monitor performance. - Optimize content delivery networks (e.g., Akamai) to enhance user experience. - Design monitoring, alerting, and incident response procedure for...


  • Jakarta, Indonesia Beyondsoft (Malaysia) Sdn. Bhd Full time

    COMPANY DESCRIPTION Beyondsoft (listed by the Shenzhen Stock Exchange, stock code 002649) is a global provider of IT consulting, product and solution services. Relying on strong R&D and innovation capabilities, the company widely adopts emerging technologies based on big data and mobile internet, including big data management platform, enterprise risk...


  • Jakarta, Indonesia PT Tiga Daya Digital Indonesia (Eksad Technology) Full time

    Tiga Daya Digital Indonesia, a subsidiary company of Triputra Group and DCI Group, aims to be an IT partner that enables rapid client growth. Eksad provides high-quality services based on strong experience in the industry and technology, building the right IT service solutions to enable its partners to speed up business development based on digital technology by...


  • Jakarta, Indonesia Shipper Full time

    **What is Shipper** Shipper is a growing technology company based in Jakarta. We provide well-rounded logistics solutions for businesses of all sizes. Today, we offer several services including First-Mile Pickup and Delivery, Fulfillment/Warehouse Management, and Cross-Border shipping services. We are financially supported by eminent investors, including...


  • Jakarta, Indonesia Global Tiket Network Full time

    We think you also hate it when a travel app gives you a headache, right? A slight piece of misinformation can ruin the trip. - That is exactly what we are tackling as t-fam! Making sure that our 17+ million users have the best experience in crafting their own adventure. Catch the sunrise on the top of Padar Island and see fascinating views of the boundless...


  • Jakarta, Indonesia Digital Muda Solutions Full time

    Description: - Maintain system availability, reliability, and performance with a focus on technical infrastructure, security, and user scale. - Collaborate with development and operations teams to design, test, and implement best practices in technology infrastructure, and make fixes and improvements as needed. - Ensure integration...


  • Jakarta, Indonesia Catalyst Tech Full time

    At Catalyst, people are the heartbeat of our company. We believe that good quality people will have a positive impact on our business. We are looking for a **Site Reliability Engineer / DevOps** to join our growing team. If you are passionate about being part of the team, building some of the most critical products, working alongside teams in the industry...


  • Jakarta, Indonesia PT Salva Teknologi Digital Full time

    Site Reliability Engineer (Junior) - Applicants should have sufficient qualifications and relevant experience in the respective fields "Beware of fraud schemes during the interview process. The company will never charge any fees for conducting the interview process. Please report to us immediately if, when you are invited for an interview and...


  • Jakarta, Indonesia Paymentology Full time

    Paymentology is the first truly global issuer-processor, giving banks and fintechs the technology, team and experience to rapidly issue and process Mastercard, Visa and UnionPay cards across more than 50 countries, at scale. Our advanced, multi-cloud platform, offering both shared and dedicated processing instances, vast global presence and richer,...


  • Jakarta, Indonesia Hukumonline.com Full time

    Manage and optimize cloud infrastructure on AWS, GCP, and Azure. - Administer and maintain Linux-based systems, ensuring their stability and security. - Implement and maintain observability solutions, including OpenTelemetry, HoneyComb, and Sentry, to monitor system performance and diagnose issues. - Configure and optimize content delivery networks, with a...