Drivetrain

Drivetrain offers a strategic finance platform designed to streamline financial planning, enhance real-time tracking of actuals, accelerate reporting processes, and support informed decision-making for finance teams in modern businesses.

Industry: Capital Markets
Company size: 11-50 employees
Founded: 2021
Funding: $15M raised

Description

  • Architect, manage, and continuously optimize highly available cloud infrastructure across AWS and GCP.
  • Design, deploy, and manage scalable Kubernetes clusters and standardized deployment configurations.
  • Implement and maintain service mesh technologies to secure, control, and observe service-to-service communication.
  • Build, maintain, and optimize CI/CD pipelines with automated testing and security gates.
  • Write, review, and maintain Terraform modules to provision and manage cloud resources.
  • Develop Python scripts and tooling to automate maintenance, backups, scaling, and recovery tasks.
  • Design and enhance monitoring, logging, and alerting systems across the observability stack.
  • Own incident response, facilitate blameless postmortems, and define and enforce SLIs, SLOs, and SLAs.
  • Collaborate with software engineers to design applications for deployability, scalability, and resilience.
  • Identify system bottlenecks, contribute to process improvements, build developer tooling, and maintain documentation.

Requirements

  • 5+ years of hands-on experience in Site Reliability Engineering, DevOps, or Cloud Infrastructure roles, preferably in a fast-paced SaaS environment.
  • Deep experience with AWS services including EC2, EKS, RDS, VPC, IAM, and S3.
  • Deep experience with GCP services including GKE, Compute Engine, Cloud SQL, IAM, and Cloud Storage.
  • Expert-level knowledge of Docker and Kubernetes, including advanced deployment strategies and lifecycle management.
  • Strong programming skills in Python and extensive experience with Terraform.
  • Hands-on experience building dashboards and alerting systems with Prometheus, Grafana, and ELK/EFK stacks.
  • Solid understanding of cloud networking, including VPC peering, load balancing, and DNS.
  • Understanding of zero-trust security principles in a containerized environment.
  • Experience with configuration management tools like Kustomize is preferred.
  • Experience with service mesh technologies such as Istio or Linkerd is preferred.
