Xsolla

Xsolla is an international payment solution provider for online games, offering tools to launch, monetize, and scale games worldwide with local payment methods and fraud prevention.

Internet Software & Services
251-1K employees
Founded 2005

Description

  • Serve as the primary dashboard monitor during shifts and continuously watch production health signals in Datadog.
  • Detect anomalies by correlating APM, logs, metrics, synthetic tests, and Real User Monitoring data.
  • Triage and investigate production incidents, create incident tickets in JIRA Service Management, and route issues to the correct team.
  • Own lower-severity incidents end-to-end from detection through resolution, including diagnosis and runbook execution.
  • Support the Technical Shift Operations (TSO) Lead during major incidents as a technical partner in the war room.
  • Draft internal and customer-facing incident communications, including Slack updates and status page posts.
  • Analyze incident trends, recurring issues, and production bugs and contribute findings to reports and post-incident reviews.
  • Compile incident timelines, draft initial post-incident review (PIR) documents, and track action items after reviews.
  • Build and maintain operational automation, incident templates, Slack workflows, dashboard widgets, and runbooks.
  • Conduct structured shift handoffs and participate in knowledge transfer sessions to improve independent resolution capability.
  • Cover for the TSO Lead when needed, including severity classification, escalation decisions, and basic incident commander functions.
  • Publish periodic health reports for critical applications.

Requirements

  • 4+ years of experience in SRE, DevOps, production operations, NOC, or technical operations in a high-availability environment.
  • Experience supporting payments, e-commerce, SaaS, or gaming workloads is preferred.
  • Strong troubleshooting and investigation skills across logs, traces, metrics, databases, and network paths.
  • Hands-on experience with Datadog or a similar observability platform such as Grafana, Splunk, New Relic, or Elastic.
  • Proficiency in at least one scripting language: Python, Go, or Bash.
  • Clear written and verbal communication skills in English.
  • Working knowledge of Kubernetes and cloud infrastructure; GCP is preferred, but AWS or Azure is also acceptable.
  • Understanding of SLOs, error budgets, and burn-rate alerting.
  • Experience with JIRA or JIRA Service Management, PagerDuty or OpsGenie, Slack, and Confluence.
  • Interest in or experience with AI/ML-assisted operations such as anomaly detection, alert correlation, predictive monitoring, or automated remediation.
  • Comfort with 24x7 shift-based operations in a follow-the-sun model, including weekend on-call rotation.
  • Experience in gaming, payments, or fintech environments is a plus.
  • Familiarity with Datadog Service Catalog, synthetic monitoring, and RUM is a plus.
  • Exposure to database and platform tools such as MySQL, PostgreSQL, Redis, Kafka, GitLab CI, ArgoCD, and Helm is a plus.
  • JIRA Service Management administration experience or ITIL Foundation certification is preferred.
