Xsolla

Xsolla is an international payment solution provider for online games, offering tools to launch, monetize, and scale games worldwide with local payment methods and fraud prevention.

Internet Software & Services
251-1K
Founded 2005

Description

  • Serve as the primary dashboard monitor during shifts by watching the GTO Operational Dashboard in Datadog and detecting anomalies across APM, logs, metrics, synthetic tests, and RUM.
  • Triage and investigate production incidents by creating tickets in JIRA Service Management, analyzing traces, logs, infrastructure and application metrics, and routing issues to the appropriate team.
  • Own lower-severity incidents end-to-end from detection through resolution, executing runbook procedures and escalating when thresholds are exceeded or code changes are required.
  • Support the TSO Lead during major incidents by surfacing real-time technical data, maintaining incident timelines, linking evidence, and executing mitigation actions.
  • Draft internal and external incident communications, including Slack updates, stakeholder notifications, and customer-facing status page posts.
  • Analyze recurring incidents, production bugs, and trends using Datadog, JIRA, and Slack, and contribute findings to reports for product and engineering teams.
  • Publish periodic health reports for critical applications and prepare incident timelines and initial post-incident review (PIR) drafts.
  • Track PIR action items after review sessions and flag overdue items to the TSO Lead.
  • Build and maintain operational automation such as alert enrichment scripts, incident templates, Slack workflows, and dashboard widgets.
  • Conduct structured shift handoffs and participate in knowledge transfer sessions to expand independent resolution capability.

Requirements

  • 4+ years of experience in SRE, DevOps, production operations, NOC, or technical operations in a high-availability environment.
  • Experience supporting payments, e-commerce, SaaS, or gaming workloads is preferred.
  • Strong troubleshooting and investigation skills with the ability to trace issues through logs, APM, infrastructure metrics, database queries, and network paths.
  • Hands-on experience with Datadog or a similar observability platform such as Grafana, Splunk, New Relic, or Elastic.
  • Proficiency in at least one scripting language: Python, Go, or Bash.
  • Clear written and verbal communication skills in English for incident tickets, shift handoffs, status updates, and PIR drafts.
  • Working knowledge of Kubernetes and cloud infrastructure, with GCP preferred and AWS/Azure acceptable.
  • Understanding of SLOs, error budgets, and multi-window burn-rate alerting.
  • Experience with incident management tools such as JIRA Service Management, PagerDuty or OpsGenie, Slack, and Confluence.
  • Comfort with 24x7 shift-based operations in a follow-the-sun model, including rotating weekend on-call.
  • Experience in gaming, payments, or fintech environments is a plus.
  • Familiarity with Datadog Service Catalog, synthetic monitoring, and RUM is a plus.
  • Experience debugging distributed systems and tracing failures across microservices is a plus.
  • Exposure to MySQL, PostgreSQL, Redis, or Kafka is a plus.
  • Familiarity with CI/CD and deployment tools such as GitLab CI, ArgoCD, or Helm is a plus.
  • JIRA Service Management administration experience is a plus.
  • ITIL Foundation certification is a plus but not required.

Benefits

  • Salary range of RM144,000 to RM216,000 per year.
  • Latest Mac workstation and additional hardware provided for work.
  • Free training and participation in specialized conferences.
  • Rich internal knowledge sharing and collaboration opportunities.
  • Health insurance covering medical, dental, and optical care for employees and dependants.
  • Flexible hours to organize your day around your needs and team demands.
  • No dress code.
  • Comfortable, new office environment.
