DevOps / Site Reliability Engineer

6 hours, 55 minutes ago
Full-time
Junior
DevOps and Infrastructure
OKX

OKX operates as a leading cryptocurrency exchange, providing users with a platform to buy, sell, and trade various digital assets such as Bitcoin, Ethereum, and XRP, while also offering tools for exploring Web3, decentralized finance (DeFi), and non-fu...

Diversified Financial Services
1K-5K
Founded 2017

Description

  • Build and maintain the core infrastructure of the AIOps platform, including unified monitoring and alerting systems and the FinOps cost observability platform.
  • Maintain and continuously optimize internal R&D infrastructure such as GitLab, Nexus, and Sonar.
  • Manage monitoring data collection, alert governance, and cost data visualization across Alibaba Cloud and AWS environments.
  • Support cloud security operations, including cloud security alert management and compliance auditing.

Requirements

  • 3+ years of DevOps or SRE experience.
  • Experience with AIOps or observability platform development is a plus.
  • Proficiency in Python.
  • Familiarity with at least one of Go or Java.
  • Full-stack capability with React/Vue frontend and backend API development is a plus.
  • Hands-on experience with at least one major cloud platform, preferably Alibaba Cloud or AWS.
  • Familiarity with cloud monitoring products such as CloudWatch or Alibaba Cloud CloudMonitor, as well as cost management tools.
  • Experience with monitoring and logging stacks such as Prometheus, Grafana, and ELK.
  • Experience maintaining and optimizing CI/CD toolchains such as GitLab CI, Nexus, and container registries.
  • Experience with AI/LLM application development, including LLM API integration, RAG, or agent frameworks, is a plus.
  • Good written and verbal English communication skills.
  • Current right to work in Singapore; visa sponsorship is not provided.

Benefits

  • Competitive total compensation package.
  • Learning and development programs plus an education subsidy.
  • Team-building programs and company events.
  • Wellness and meal allowances.
  • Comprehensive healthcare schemes for employees and dependants.
