Megaport

Megaport simplifies network connectivity with scalable bandwidth for cloud connections, metro Ethernet, and data centre backhaul. Offering extensive coverage in APAC and expanding globally, Megaport empowers users to manage their networks through its u...

Diversified Telecommunication Services
251-1K
Founded 2013
$26M raised

Description

  • Improve production reliability and system resilience within an SRE-scoped team.
  • Champion DevOps and SRE best practices and high standards of work.
  • Communicate with teams and stakeholders during requirements analysis, demonstrations, and delivery.
  • Investigate and resolve complex technical problems across multiple technologies.
  • Participate in on-call rotation, incident response, and blameless post-incident reviews.
  • Write code, handle alerts, improve solutions, and support other team members.
  • Work across globally distributed time zones in a self-directed, collaborative environment.
  • Contribute fresh ideas and help drive customer success and company goals.

Requirements

  • 5+ years administering Linux systems and related infrastructure in production environments.
  • Strong understanding of SRE concepts such as SLIs, SLOs, SLAs, error budgets, blast radius, and blameless postmortems.
  • Focus on automation, toil reduction, and preventing recurring issues.
  • Experience writing effective runbooks for a broader team.
  • Strong Kubernetes and ecosystem fundamentals.
  • Cloud infrastructure experience, with AWS strongly preferred; bare-metal experience is a bonus.
  • Strong tool development skills in Bash; Python or Go preferred.
  • Infrastructure-as-code experience, with Terraform preferred.
  • CI/CD and version control experience, with GitHub preferred.
  • Database experience with Postgres, Cassandra, or ClickHouse preferred.
  • Experience operating production observability stacks across metrics, logs, and traces.
  • Comfort working on live production infrastructure with strong troubleshooting and incident ownership.
  • A history of continual professional development.
  • Self-directed and comfortable working asynchronously with a globally distributed team.
  • Experience picking up adjacent work when needed.

Benefits

  • Flexible working environments.
  • Birthday leave.
  • Generous study and training allowance plus 5 days of paid study leave.
  • Creative, fun, and contemporary workspaces.
  • A motivated team of industry experts and new talent.
  • Recognition through ‘Legend’ and ‘Kudos’ awards.
  • Health and wellness program.


