DriveWealth

DriveWealth is a pioneering technology company that provides solutions to transform the investing landscape. With award-winning APIs, they empower over 100 partners worldwide to offer fractional equities trading and embedded investing experiences. Thei...

Capital Markets
251-1K employees
Founded 2012
$551M raised

Description

  • Design and develop internal tools and SRE platforms to reduce toil and improve developer velocity.
  • Architect and maintain modular infrastructure as code using Terraform.
  • Manage GitOps workflows using ArgoCD.
  • Implement observability standards using OpenTelemetry and the Grafana stack.
  • Define and manage SLIs, SLOs, and error budgets.
  • Review software architecture and Kubernetes metrics to support high availability, capacity planning, and cost optimization across AWS regions.
  • Lead incident response and perform complex root-cause analysis.
  • Champion a blameless post-mortem culture.
  • Partner with engineering teams to drive adoption of new tools, security standards, and reliability best practices.
  • Support 24/7 global operations through critical on-call responsibilities.

Requirements

  • Proficient in Linux administration with a deep understanding of the TCP/IP stack, OSI model, DNS, and network troubleshooting.
  • Experience working in highly regulated financial environments or with FIX/API connectivity.
  • Hands-on experience managing production-grade Kubernetes clusters, including RBAC, autoscaling, Helm, and multi-cluster patterns.
  • Strong grasp of AWS core services, security, and high-availability patterns.
  • Proficiency with boto3 and AWS CLI for automation.
  • Experience building secure, automated CI/CD pipelines and operating GitOps workflows with ArgoCD.
  • Strong scripting and development skills in Python or Golang, along with Bash and Ansible.
  • Experience with secrets management, vulnerability scanning, and securing the software supply chain.
  • Familiarity with using LLMs, public MCP servers, or Amazon Bedrock AgentCore to enhance SRE workflows.
  • Experience managing Kafka, MQ, SQS, or orchestration tools like Airflow and Rundeck.
  • Must be authorized to work for any employer in the U.S.; DriveWealth cannot sponsor or take over sponsorship of an employment visa.
  • This remote role is open to candidates in some, but not all, U.S. states.

Benefits

  • Base salary range of $150,000 to $170,000 USD for remote candidates in most U.S. states.
  • Eligible for bonus and equity.
  • 401(k) match.
  • Full insurance coverage.
  • Wellness reimbursement.
  • Company-provided phone.
  • Personal development allowance.
  • Generous PTO, observed holidays, and extended leave.
