LivePerson

LivePerson is a global leader in Conversational AI, offering real-time, intelligent customer engagement solutions through its LiveEngage platform to more than 18,000 clients worldwide.

Internet Software & Services
1K-5K
Founded 1995

Description

  • Collaborate with developers, QA, and product teams during sprint planning to understand release plans, dependencies, and infrastructure needs.
  • Participate in the application release cycle to ensure deployments are automated, consistent, and reliable.
  • Manage and operate Kubernetes clusters in Google Kubernetes Engine (GKE) and Amazon Elastic Kubernetes Service (EKS).
  • Develop and maintain Terraform modules for provisioning and configuring cloud infrastructure across GCP and AWS.
  • Standardize service deployments using Helm for templating and versioned releases.
  • Build and improve observability using Prometheus, Grafana, and Datadog to monitor platform and application performance.
  • Design, implement, and maintain GitLab CI/CD pipelines for build, test, and deployment automation.
  • Develop scripts and tooling in Python, Go, or Shell to reduce manual work and improve efficiency.
  • Participate in a 24/7 on-call rotation to detect, mitigate, and resolve incidents quickly.
  • Perform root cause analysis and contribute to post-incident reviews to prevent recurrence.
  • Identify reliability and scalability gaps early and partner with teams to address systemic risks.

Requirements

  • 5-8 years of experience as a Site Reliability Engineer, Platform Engineer, or DevOps Engineer.
  • Hands-on experience managing Kubernetes clusters on GKE (GCP) and EKS (AWS).
  • Strong knowledge of Terraform, Helm, and GitLab CI/CD pipelines.
  • Proficiency in Python, Go, or Shell scripting for automation and tooling.
  • Experience implementing and managing observability stacks such as Prometheus, Grafana, and Datadog.
  • Deep understanding of Linux systems, cloud networking, and container orchestration concepts.
  • Experience working in Agile/Scrum environments and partnering closely with developers.
  • Excellent analytical skills with a proactive attitude and the ability to question assumptions and escalate risks early.
  • Experience with ArgoCD or Flux is preferred.
  • Familiarity with service mesh tools such as Istio or Linkerd, or with API gateways, is preferred.
  • Knowledge of cloud cost optimization, autoscaling, or security best practices is preferred.
  • Experience with incident management tools such as PagerDuty or ServiceNow is preferred.

Benefits

  • Flexible working arrangements, including remote work in India.
  • Competitive compensation.
  • 15 days of PTO plus casual leave and sick leave.
  • Family floater insurance coverage of 8 lakhs.
  • Personal accident and life insurance coverage worth 3x gross annual salary.
  • Career growth opportunities, including certifications and mentorship.
  • A collaborative, global team culture that values ownership, learning, and continuous improvement.

Similar Roles

Sr. Production Engineer, Solutions Engineering

Pinterest 5K-10K Internet Software & Services

Pinterest is hiring a Senior Production Engineer on Solutions Engineering to design AI-driven reliability and automation systems that improve the operation of large-scale distributed infrastructure serving hundreds of millions of users.

Ansible AWS Azure Chef Docker Envoy GCP Go Hadoop Kafka Kubernetes Linux MySQL Puppet Python Terraform Unix
4 hours, 40 minutes ago

Senior Network Site Reliability Engineer

Miro 1K-5K Internet Software & Services

Miro is hiring a Senior Network Site Reliability Engineer to strengthen the reliability, availability, and scalability of its AWS-based production infrastructure.

Agile AWS Azure Bash CI/CD DNS EC2 GCP GitHub GitLab Kubernetes Linux Python TCP/IP Terraform
4 hours, 55 minutes ago

Senior Site Reliability Engineer - Network

Harford County Public Library 51-250 Diversified Consumer Services

Stone Tech, part of Stone Co., is seeking a Senior Site Reliability Engineer - Network to lead critical network infrastructure projects and evolve the group's global connectivity architecture.

Ansible API Gateway AWS Azure Cisco Datadog Fortinet GCP Kong Palo Alto Prometheus SIEM Splunk Terraform Zabbix
5 hours, 10 minutes ago

Senior Site Reliability Engineer (SRE)

Swile 251-1K Professional Services

Swile is hiring a Senior Site Reliability Engineer to help design and operate the shared infrastructure platform supporting its growing products and international expansion in Brazil.

Android AWS Datadog GitHub Go iOS Java Kafka Kotlin Kubernetes Node.js PostgreSQL Python React Redis Ruby Ruby on Rails Snowflake Swift Terraform TypeScript
5 hours, 25 minutes ago
