Senior/Staff Platform Engineer

1 month ago
Full-time
Lead
DevOps and Infrastructure
VRChat

VRChat is a platform for creating and exploring immersive virtual reality experiences. It supports social interaction and community-driven content creation through its Unity SDK.

Internet Software & Services
51-250 employees
Founded 2014
$95M raised

Description

  • Operate and improve production infrastructure with a focus on reliability, security, performance, and cost efficiency.
  • Define, measure, and improve reliability using SLIs, SLOs, SLAs, error budgets, and DORA metrics.
  • Build and improve monitoring, alerting, dashboards, logging, and incident response processes.
  • Participate in incident management, root cause analysis, postmortems, and follow-up remediation.
  • Automate infrastructure and operational workflows using infrastructure-as-code and scripting tools.
  • Work closely with engineering teams to improve service reliability, deployment quality, and operational readiness.
  • Turn ambiguous infrastructure, reliability, and operational problems into clear, scalable, and measurable solutions.
  • Engage with backend codebases through code reviews, pull requests, and occasional feature or tooling work.
  • Plan and deploy infrastructure in collaboration with IT, Engineering, and functional leaders.

Requirements

  • 8+ years of experience in SRE, DevOps, Platform Engineering, or Infrastructure Engineering.
  • Strong experience operating high-availability production systems.
  • Experience with cloud or hybrid cloud environments and tools such as Terraform or OpenTofu.
  • Strong knowledge of Linux, networking, automation, observability, and incident management.
  • Strong communication skills and ability to work with technical and non-technical stakeholders.
  • Operational knowledge of databases such as MongoDB, Elasticsearch, or Redis.
  • Experience with AWS, including core infrastructure services, cost optimization, and multi-account architecture (preferred).
  • Experience with Kubernetes, including networking, service discovery, ingress, and workload reliability (preferred).
  • Experience with Cilium or other Kubernetes networking/security solutions (preferred).
  • Experience supporting large-scale storage systems or working with CDNs, caching, distributed systems, or real-time platforms (preferred).

Benefits

  • 100% remote work from anywhere.
  • Health benefits.
  • 401(k) for US employees and RRSP for Canadian employees.
  • Stock options.
  • Generous paid holiday schedule.
  • Unlimited/flexible vacation time.
  • Paid parental leave benefits.
