Smarsh

Smarsh provides cloud-based archiving and compliance solutions that help organizations in regulated and litigious industries manage the risks associated with their electronic communications across more than 80 channels.

IT Services
251-1K employees
Founded 2001
$44M raised

Description

  • Own Kubernetes platform operations, including cluster health, workload deployments, scaling, and incident response.
  • Design, implement, and operate infrastructure automation using Ansible, Terraform, and GitOps workflows.
  • Lead migration projects that move on-premises workloads toward cloud-native platform services.
  • Build and maintain CI/CD pipelines for infrastructure and application delivery.
  • Improve observability through dashboards, alert tuning, and SLO/SLA definition using Datadog, Splunk, and ELK.
  • Participate in the on-call rotation and respond to P1/P2 incidents.
  • Support security and compliance needs, including patch management, access controls, and audit readiness.
  • Contribute to runbooks and operational documentation for owned systems.
  • Collaborate with adjacent platform teams to build a shared platform and drive its adoption.

Requirements

  • 4–7 years of experience in platform engineering, SRE, or infrastructure engineering roles.
  • Strong hands-on experience with Kubernetes, including cluster operations, Helm, and workload troubleshooting.
  • Proficiency with infrastructure-as-code tooling, specifically Ansible and/or Terraform in production environments.
  • Strong Linux systems administration skills, preferably Ubuntu.
  • Experience with GitOps workflows and CI/CD pipelines at scale.
  • Experience with VMware vSphere in a production environment.
  • Demonstrated ability to self-direct and drive projects to completion with minimal oversight.
  • Strong communication skills with cross-functional stakeholders.
  • Experience with Datadog, Splunk, or ELK for dashboards, monitors, and log management is preferred.
  • Familiarity with compliance-sensitive or regulated industry infrastructure is preferred.
  • Experience with ArgoCD, Flux, or similar GitOps continuous delivery tooling is preferred.
  • Familiarity with Jenkins or Concourse for CI/CD pipeline management is preferred.
  • Familiarity with VMware vSphere Kubernetes Service (VKS) or other VMware-native Kubernetes platforms is preferred.
  • Python scripting for automation and tooling is preferred.
  • Prior experience in an on-call rotation with a defined SLA structure is preferred.
  • Experience with cloud infrastructure, especially AWS, is beneficial as cloud responsibilities expand.

Benefits

  • Base salary range of $120,000 to $160,000 per year.
  • Bonus programs may be available and will be discussed during the recruiting process.
  • Offers take local cost of living into account.
  • Remote work arrangement.
  • Opportunity to work on a major platform modernization effort.
  • Chance to take on expanding cloud infrastructure responsibilities.
  • Work in a collaborative, global organization that values learning and development.
