Capital.com

Capital.com is a leading fintech company providing online trading services through a smart investment app, offering access to 3700+ global markets with AI-powered features for secure and efficient trading.

Capital Markets

Financials

251-1K (780)

Founded 2016

$25M raised

5 open positions

Senior SRE Engineer (Observability Focus)

1 month ago

Bulgaria, Poland, Cyprus

Full-time

Senior

Site Reliability Engineer (SRE)

Software Development

Ansible, Argo CD, AWS, Bash, Elasticsearch, Fluentd, GitOps, Grafana, Helm, Java, JavaScript, Kafka, Kubernetes, OpenSearch, OpenTelemetry, Prometheus, Python, Terraform, TypeScript

Description

Own the full observability stack for metrics, logs, and traces, from pipeline design through day-2 operations.
Architect and operate the VictoriaMetrics cluster topology, including scraping, remote write, alerting rules, and cardinality control.
Operate OpenSearch clusters, including index lifecycle management, hot-warm-cold architecture, shard tuning, and ingest pipelines.
Build and maintain OpenTelemetry Collector pipelines and instrument services across Java, Python, and JavaScript/TypeScript stacks.
Run Kafka as the telemetry transport layer, including topic design, partition strategy, lag monitoring, and throughput tuning.
Manage log shipping infrastructure with Fluent Bit, Vector, or Fluentd and define structured logging standards across services.
Build Grafana dashboards and alerting that are clear, actionable, and useful for engineering teams.
Improve sampling, batching, and context propagation strategies across distributed services.
Participate in incident response, post-mortems, and reliability improvements driven by observability signals.
Mentor engineers on observability practices, tooling, and structured logging standards.

Requirements

6+ years of experience in DevOps, SRE, or platform engineering roles.
At least 2 years of experience focused on observability tooling at production scale.
Deep hands-on experience with VictoriaMetrics or Prometheus, including MetricsQL/PromQL, exporters, service discovery, remote write, downsampling, and retention management.
Solid OpenSearch or Elasticsearch experience, including cluster operations, Query DSL, ISM policies, and ingest pipeline design.
Production experience with OpenTelemetry, including Collector configuration, OTLP, context propagation, and instrumentation across multiple languages.
Strong Kafka experience, including producer/consumer patterns, consumer group management, Kafka Connect, Schema Registry, and JMX-based monitoring.
Experience with Strimzi for running Kafka on Kubernetes is a plus.
Proficiency with log shippers such as Fluent Bit, Vector, or Fluentd and structured log parsing/normalization.
Working knowledge of Kubernetes, Helm, Argo CD/GitOps, Terraform, and Ansible.
Comfort in a hybrid AWS and on-prem environment, with solid networking knowledge as it applies to scraping and shipping pipelines.
Scripting ability in Bash or Python for automation and tooling.
Strong communication skills and English proficiency.

Benefits

Competitive salary.
Hybrid work setup that supports a flexible work-life balance.
Generous annual leave.
Employee referral program.
Comprehensive health and pension benefits, including medical insurance.
30 extra days to work remotely from anywhere in the world, with some restrictions.
Two additional paid volunteer days each year.

Similar Roles

Senior Site Reliability Engineer

Counterpart Health 51-200 hospital & health care

Counterpart Health is hiring a Senior Site Reliability and Infrastructure Engineer to support and evolve the technology platform behind its primary care tool and maintain reliable infrastructure for domestic and international workloads.

United States Full-time Senior Site Reliability Engineer (SRE)

$160k-$208k

AWS, Azure, CI/CD, Containerd, DNS, Docker, GCP, Go, gRPC, Helm, Kubernetes, Linux, Load Balancing, Prometheus, Python, Shell Scripting, TCP/IP

16 hours, 46 minutes ago

Senior Test Platform & Reliability Engineer - Star Trek Fleet Command

Scopely 1K-5K Internet Software & Services

Scopely is hiring a Senior Test Platform & Reliability Engineer in Ireland to build validation, reliability, and developer enablement platforms for Star Trek Fleet Command’s large-scale live-service backend systems.

Ireland Full-time Senior SDET (Software Development Engineer in Test) Site Reliability Engineer (SRE)

AWS, Bash, CI/CD, Docker, GitLab, Go, Python, Terraform

17 hours, 1 minute ago

Senior Software Engineer - Databases, SRE | Canada | Remote

Grafana 1K-5K IT Services

Grafana Labs is hiring a Senior Software Engineer for its remote SRE team to improve reliability and operability of Grafana Cloud database services for high-SLA customers across AWS, GCP, and Azure.

Canada Full-time Senior Site Reliability Engineer (SRE) Software Engineer

$108k-$130k

AWS, Azure, GCP, Go, Helm, Java, Kubernetes, Linux, Microservices, Python, Terraform

1 day, 16 hours ago

Senior Site Reliability Engineer

Semios 51-250 Food Products

Semios Group is hiring a Senior Site Reliability Engineer to help scale, secure, and improve the reliability of its global agricultural technology platform.

Canada Full-time Senior Site Reliability Engineer (SRE)

$140k-$160k

AWS, Azure, Bash, Buildkite, CI/CD, Datadog, Docker, Envoy, GCP, Git, GitHub, GitHub Actions, GitLab, Go, Jenkins, Kubernetes, Linux, NATS, New Relic, Prometheus, Python, Ruby, Splunk, Terraform

1 day, 17 hours ago
