Senior Site Reliability Engineer - AWS

2 weeks, 2 days ago
Full-time
Senior
DevOps and Infrastructure
Filevine

Filevine is a top legal tech company revolutionizing legal work with AI-powered case management software, empowering law firms to streamline operations and enhance client services.

Specialized Consumer Services
251-1K employees
Founded 2015
$226M raised

Description

  • Provide leadership, mentoring, and sound judgment as the reliability engineering lead on the team.
  • Design and maintain autonomous systems for building, deploying, testing, and operating Filevine products.
  • Serve as the authoritative voice of reliability across the full software development lifecycle.
  • Monitor, aggregate, dashboard, and alert on software and infrastructure events to ensure visibility and rapid response.
  • Continuously improve CI/CD pipelines, automation scripts, playbooks, and tools to streamline operations and reduce resolution time.
  • Identify and resolve gaps in system availability, performance, and security while strengthening the overall security posture.
  • Document processes, architecture, procedures, and best practices to support team effectiveness.
  • Research, adopt, or build reliable tools that improve engineer productivity.
  • Collaborate with team members and stakeholders, mentor junior engineers, and participate in a 24/7 on-call rotation for production support and emergency response.

Requirements

  • 8+ years of hands-on technical experience in software engineering, infrastructure, or operations roles, including at least 4 years dedicated to Site Reliability Engineering.
  • Strong curiosity, self-motivation, and a continuous learning mindset with proactive enthusiasm for improving systems and processes.
  • Strong proficiency in Python, Bash, PowerShell, and other common SRE scripting and tooling technologies.
  • Expert-level experience designing, building, and maintaining autonomous systems for build, deployment, testing, monitoring, and operations.
  • Hands-on experience with AWS services such as EC2, Kubernetes/EKS, CloudWatch, Lambda, S3, and IAM.
  • Proficiency in core SRE skills including monitoring and alerting, incident response, capacity planning, performance optimization, CI/CD enhancement, and reliability best practices.
  • Bachelor’s degree in Computer Science, Information Systems, or a related field, or equivalent certifications such as AWS or Google Cloud Professional certifications, or substantial comparable direct work experience.
  • Proven track record of independently driving reliability improvements, reducing toil through automation, and supporting highly available, scalable production systems in a fast-paced environment.

Benefits

  • $160,000 - $190,000 base salary.
  • Paid time off, subject to the company's PTO policy.
  • Comprehensive benefits package.
  • Medical, dental, and vision insurance for full-time employees.
  • Maternity and paternity leave for full-time employees.
  • Short- and long-term disability coverage.
  • Opportunity to learn from a dedicated leadership team.
  • Top-of-the-line company swag.
