OfficeSpace Software

OfficeSpace Software is the world's leading workplace management platform that provides a complete solution for the allocation and management of company workspaces. With a focus on efficiency and user-friendly tools, OfficeSpace empowers organizations ...

Internet Software & Services
251-1K employees
Founded 2004
$150M raised

Description

  • Drive measurable improvements in latency, throughput, and availability across a large-scale production environment.
  • Own system performance across Linux internals and Kubernetes scheduling, and eliminate bottlenecks before customers are impacted.
  • Define and enforce SLIs, SLOs, and error budgets to balance speed, reliability, and growth.
  • Partner with application engineers to profile code paths, improve execution efficiency, and harden services under real load.
  • Lead database performance optimization across queries, indexing, replication, and workload isolation.
  • Design and oversee AI-assisted load testing, stress testing, and capacity planning workflows.
  • Guide the migration from monolithic deployments to multi-tenant Kubernetes platforms.
  • Reduce infrastructure spend through architectural decisions, right-sizing, and intelligent scaling strategies.
  • Build and supervise automation for infrastructure provisioning, configuration management, and observability.
  • Set operational standards for reliability, performance, and incident response for production systems.

Requirements

  • 7+ years of experience operating and evolving large-scale production systems.
  • Deep Linux systems expertise with hands-on performance tuning across CPU, memory, disk, and networking.
  • Strong Python skills for automation, tooling, and AI-assisted systems workflows.
  • Production experience with Ruby/Rails ecosystems, including Puma and Sidekiq.
  • Proven ability to diagnose and resolve complex database performance issues in MySQL/MariaDB or PostgreSQL.
  • Advanced Kubernetes experience, including workload sizing, scheduling, and multi-tenant operations.
  • Infrastructure-as-code experience with Terraform and Terragrunt.
  • Experience with configuration management tools such as Puppet or Ansible.
  • Strong observability experience across metrics, logs, and traces using tools like Prometheus, Grafana, Datadog, or ELK.
  • AI fluency and comfort supervising AI agents for analysis, testing, and reporting, and validating their outputs.
  • Preferred: experience scaling and refactoring monolithic applications under real production load.
  • Preferred: experience extracting databases or other stateful components from monoliths.
  • Preferred: experience tuning Apache and Nginx at scale.
  • Preferred: experience with Redis performance optimization and operational management.
  • Preferred: experience with CI/CD systems and GitOps workflows, including ArgoCD.
  • Preferred: experience with cloud cost optimization and FinOps-aligned operational practices.

Benefits

  • Competitive benefits packages globally.
  • Benefits designed to support health, well-being, and financial security.
  • Autonomy and ownership in a trust-based environment.
  • Opportunities for growth, learning, and professional development.
  • A collaborative, results-driven culture.
  • A fast-paced, innovation-focused environment that embraces AI and change.
