Site Reliability Engineer (Remote) - #35039

1 month, 2 weeks ago
Full-time
Mid Level
DevOps and Infrastructure
Recruitment & Search Agency - Headhunter in the Philippines

Manila Recruitment is a top recruitment agency in the Philippines, offering hiring solutions for executive search, IT, developers, managers, and specialized roles. With a database of over 250,000 candidates, we provide innovative headhunting services a...

Professional Services
11-50
Founded 2010

Description

  • Monitor the platform using Cloud Run logs, Temporal workflow UI, GKE pod status, and Pub/Sub queue states.
  • Triage issues to determine whether problems originate in the Python agent layer, Temporal workflows, Go APIs, or Vue frontend.
  • Investigate and resolve paralegal-facing operational issues such as stuck cases, failed faxes, and pending qualifications.
  • Use SQL against AlloyDB PostgreSQL to support troubleshooting and issue investigation.
  • Write and maintain runbooks and escalation procedures for recurring incidents and support workflows.
  • Support integrations across fax, email, SMS/voice, authentication, and external legal or healthcare systems.
  • Work across the platform's backend, workflow, infrastructure, and data service components to keep operations running smoothly.

Requirements

  • Experience troubleshooting production systems across logs, workflows, pods, queues, APIs, and UI layers.
  • Comfort working with SQL against PostgreSQL or similar databases.
  • Familiarity with cloud-based infrastructure and services such as GCP, Cloud Run, GKE, Pub/Sub, Redis, and Terraform.
  • Ability to diagnose issues in Python services, Go microservices, and web applications.
  • Experience writing runbooks, support documentation, or escalation procedures.
  • Legal operations, litigation support, or similar domain experience is a bonus.
  • Understanding of integration-based workflows with external systems such as fax, email, SMS/voice, or CRM/CMS tools is preferred.
