Capital.com

Capital.com is a leading fintech company that provides online trading services through a smart investment app. The app offers access to 3,700+ global markets, with AI-powered features for secure and efficient trading.

Industry: Capital Markets
Company size: 251-1K employees
Founded: 2016
Funding: $25M raised

Description

  • Design, deploy, and maintain scalable cloud infrastructure on AWS with high availability, performance, and security.
  • Own and evolve Kubernetes cluster management, including bare-metal deployments, and support reliable containerised workloads with Docker and Helm.
  • Build and maintain CI/CD pipelines using GitLab CI and GitOps workflows with FluxCD or ArgoCD.
  • Define, manage, and review Infrastructure as Code using Terraform.
  • Lead monitoring and observability efforts, including dashboards, alerting, and log pipelines with VictoriaMetrics/Prometheus, Grafana, and the ELK stack.
  • Operate and optimize Apache Kafka ecosystems, including Strimzi, Kafka Connect, and MirrorMaker.
  • Drive incident response, root cause analysis, and post-mortem practices to improve reliability.
  • Collaborate with Engineering, Security, and Product teams to embed DevOps best practices across the organisation.
  • Mentor and guide junior engineers to raise the engineering bar for infrastructure reliability and automation.

Requirements

  • 6+ years of hands-on experience in a DevOps or SRE role.
  • Strong knowledge of AWS services, including VPC, EC2, EKS, S3, ECR, EBS, RDS, ElastiCache, IAM, KMS, Secrets Manager, SSM Parameter Store, CloudWatch, MSK, SNS, SQS, Route 53, Direct Connect, Transit Gateway, and ELB/ALB/NLB.
  • Solid Linux administration skills with a deep understanding of system internals.
  • Deep expertise in Kubernetes, including bare-metal cluster deployment and day-2 operations.
  • Proficiency with Docker and Helm.
  • Hands-on experience with Terraform as a primary Infrastructure as Code tool, including writing, reviewing, and maintaining production-grade modules.
  • Proven experience with GitLab CI for building and maintaining CI/CD pipelines; familiarity with GitOps practices using FluxCD or ArgoCD.
  • Strong background in monitoring and observability with VictoriaMetrics or Prometheus, Grafana, and the ELK stack.
  • Experience operating and managing Apache Kafka ecosystems, including Strimzi, Kafka Connect, and MirrorMaker.
  • Experience with Ansible for configuration management; AWX experience is a plus.
  • Proficiency in scripting and automation with Bash, Python, and Go.
  • Strong communication skills and the ability to collaborate cross-functionally in a fast-paced, regulated environment.
  • English language proficiency.

Benefits

  • Competitive salary.
  • Hybrid work arrangement with flexibility to work remotely.
  • Generous annual leave.
  • Employee referral program.
  • Comprehensive health and pension benefits, including medical insurance.
  • 30 extra days per year to work remotely from anywhere in the world, subject to restrictions.
  • Two additional paid volunteer days each year.


