Xsolla

Xsolla is an international payment solution provider for online games, offering tools to launch, monetize, and scale games worldwide with local payment methods and fraud prevention.

Internet Software & Services
251-1K
Founded 2005

Description

  • Serve as the primary dashboard monitor during shifts and continuously watch production health signals in Datadog.
  • Detect anomalies by correlating APM, logs, metrics, synthetic tests, and Real User Monitoring data.
  • Triage and investigate production incidents, create incident tickets in JIRA Service Management, and route issues to the correct team.
  • Own lower-severity incidents end-to-end from detection through resolution, including diagnosis and runbook execution.
  • Support the Technical Shift Operations (TSO) Lead during major incidents as a technical partner in the war room.
  • Draft internal and customer-facing incident communications, including Slack updates and status page posts.
  • Analyze incident trends, recurring issues, and production bugs and contribute findings to reports and post-incident reviews.
  • Compile incident timelines, draft initial post-incident review (PIR) documents, and track action items after reviews.
  • Build and maintain operational automation, incident templates, Slack workflows, dashboard widgets, and runbooks.
  • Conduct structured shift handoffs and participate in knowledge transfer sessions to improve independent resolution capability.
  • Cover for the TSO Lead when needed, including severity classification, escalation decisions, and basic incident commander functions.
  • Publish periodic health reports for critical applications.

Requirements

  • 4+ years of experience in SRE, DevOps, production operations, NOC, or technical operations in a high-availability environment.
  • Experience supporting payments, e-commerce, SaaS, or gaming workloads is preferred.
  • Strong troubleshooting and investigation skills across logs, traces, metrics, databases, and network paths.
  • Hands-on experience with Datadog or a similar observability platform such as Grafana, Splunk, New Relic, or Elastic.
  • Proficiency in at least one scripting language: Python, Go, or Bash.
  • Clear written and verbal communication skills in English.
  • Working knowledge of Kubernetes and cloud infrastructure; GCP is preferred, but AWS or Azure is acceptable.
  • Understanding of SLOs, error budgets, and burn-rate alerting.
  • Experience with JIRA or JIRA Service Management, PagerDuty or OpsGenie, Slack, and Confluence.
  • Interest in or experience with AI/ML-assisted operations such as anomaly detection, alert correlation, predictive monitoring, or automated remediation.
  • Comfort with 24x7 shift-based operations in a follow-the-sun model, including weekend on-call rotation.
  • Experience in gaming, payments, or fintech environments is a plus.
  • Familiarity with Datadog Service Catalog, synthetic monitoring, and RUM is a plus.
  • Exposure to database and platform tools such as MySQL, PostgreSQL, Redis, Kafka, GitLab CI, ArgoCD, and Helm is a plus.
  • JIRA Service Management administration experience or ITIL Foundation certification is preferred.
