Tyk API Management

Tyk is a leading API Management Platform that enables interconnectivity between systems and devices through its fast, scalable, and open-source API Gateway, Analytics, Dev Portal, and Dashboard.

Internet Software & Services
51-250
Founded 2015
$40M raised

Description

  • Maintain Tyk Cloud availability and help define SLA/SLO/SLI targets.
  • Identify reliability issues and work with the squad to resolve them.
  • Create and improve metrics and dashboards to monitor platform health.
  • Participate in the on-call rotation and serve as first-line incident management support.
  • Conduct post-incident analysis and help define response processes.
  • Automate common operational tasks and improve support workflows.
  • Document operational knowledge, SRE processes, and policies.
  • Support the expansion of the platform across multi-region and multi-cloud environments.
  • Recommend and implement ways to improve operational efficiency and reduce running costs without affecting service.
  • Assist with cloud penetration testing by coordinating with the provider and preparing technical details and environment setup.

Requirements

  • Experience launching and operating production-scale Kubernetes clusters.
  • Experience designing and operating infrastructure on AWS and other cloud providers.
  • Experience operating MongoDB or similar document databases.
  • Experience operating Redis or similar key-value storage clusters.
  • Experience administering Linux servers and maintaining distributed software.
  • Experience operating Prometheus, Grafana, and logging collection/analysis systems.
  • Strong collaboration skills and a proactive, energetic, innovative, change-oriented mindset.
  • Advanced knowledge of Kubernetes and containers, AWS/EKS, and Linux.
  • Proficiency with Terraform, Helm, and infrastructure as code.
  • Familiarity with Go, monitoring tools such as Thanos, and networking concepts including subnets, routing, peering, load balancing, NAT, DNS, TCP/IP, HTTP, TLS, and UDP.
  • Availability to participate in the on-call rotation, including the 16:00–04:00 UTC window.
  • Nice to have: experience with GCP or Azure, bare metal infrastructure, API management, large-scale distributed storage, Rancher, CKA/CKAD/CKS certifications, or production software delivery in Go.

Benefits

  • Unlimited paid holiday.
  • Remote working from anywhere in the world.
  • Flexible working hours.
  • Employee share scheme.
  • Generous maternity and paternity leave.
  • Company retreats.
  • An inclusive, values-driven culture that emphasizes authenticity, respect, responsibility, independence, honesty, diversity, and inclusion.
