Senior Site Reliability Engineer, Security & Compliance (L3)

2 hours ago
Full-time
Senior
DevOps and Infrastructure
CoinGecko

CoinGecko is a leading cryptocurrency ranking website offering a detailed evaluation of digital currencies based on various metrics.

IT Services
51-250 employees
Founded 2014

Description

  • Review system architecture and software components with engineers and ensure consistent best practices across teams.
  • Own service reliability objectives, monitor operational metrics, and lead improvement plans to meet SLOs and SLAs.
  • Develop and maintain infrastructure tools, including infrastructure-as-code resources, to scale operations and increase team autonomy.
  • Manage, audit, and improve security controls to meet enterprise requirements and compliance standards.
  • Collaborate with legal and compliance teams to assess and manage overall risk.
  • Lead release planning activities such as canary and blue-green deployments, including test environment provisioning and ad hoc performance testing.
  • Lead incident response and post-mortems to resolve production issues, identify root causes, and prevent recurrence.
  • Develop and implement disaster recovery plans, including data recovery procedures and fault-injection simulations on production replicas.
  • Handle day-to-day operational tasks such as access onboarding/offboarding, configuration, patch management, and capacity planning.
  • Develop runbooks, documentation, and technical assets, and support periodic technical audits and cross-functional technical questions.

Requirements

  • 3 to 5 years of experience managing software deployments and production instrumentation in environments with defined SLAs and SLOs.
  • Strong knowledge of software delivery and DevOps principles.
  • Experience with cloud platforms such as AWS, Cloudflare, or GCP.
  • Experience with infrastructure-as-code tools such as Terraform or CloudFormation.
  • Strong programming and scripting skills in Python, Go, Ruby, or similar languages.
  • Bachelor’s degree in Computer Science, InfoSec, or a related field, or relevant professional certifications such as Certified DevOps Professional or AWS/GCP Solutions Architect Professional.
  • Ability to take substantial features from concept to shipping as a sole contributor.
  • Ability to work effectively on open-ended projects, evaluate multiple solutions independently, and dive deep into complex problems.
  • Strong problem-solving and communication skills, including producing structured, data-backed written analysis under pressure.
  • Experience supporting on-call rotations for 24x7 services, including troubleshooting, following runbooks, and escalating incidents.
  • Experience working in a growth-stage startup is preferred.
  • Experience building applications across different tech stacks is preferred.
  • Interest in decentralized technologies and cryptocurrency applications is preferred.

Benefits

  • Remote work flexibility, with optional office space in Malaysia and Singapore.
  • Comprehensive life and hospitalization insurance, including coverage for dependents.
  • Virtual share options, subject to terms and conditions.
  • Annual bonus, subject to terms and conditions.
  • Parking allowance on a claim basis.
  • Monthly meal allowance of RM600 or SGD400.
  • Annual learning allowance of USD500 on a claim basis.
  • Social activity allowance and an annual company offsite.
