LivePerson

LivePerson is a global leader in Conversational AI, offering real-time, intelligent customer engagement solutions through its platform, LiveEngage, to more than 18,000 clients worldwide.

Internet Software & Services
1K-5K
Founded 1995

Description

  • Collaborate with developers, QA, and product teams during sprint planning to understand release plans, dependencies, and infrastructure needs.
  • Participate in the application release cycle to ensure deployments are automated, consistent, and reliable.
  • Manage and operate Kubernetes clusters in Google Kubernetes Engine (GKE) and Amazon Elastic Kubernetes Service (EKS).
  • Develop and maintain Terraform modules for provisioning and configuring cloud infrastructure across GCP and AWS.
  • Standardize service deployments using Helm for templating and versioned releases.
  • Build and improve observability using Prometheus, Grafana, and Datadog to monitor platform and application performance.
  • Design, implement, and maintain GitLab CI/CD pipelines for build, test, and deployment automation.
  • Develop scripts and tooling in Python, Go, or Shell to reduce manual work and improve efficiency.
  • Participate in a 24/7 on-call rotation to detect, mitigate, and resolve incidents quickly.
  • Perform root cause analysis and contribute to post-incident reviews to prevent recurrence.
  • Identify reliability and scalability gaps early and partner with teams to address systemic risks.

Requirements

  • 5-8 years of experience as a Site Reliability Engineer, Platform Engineer, or DevOps Engineer.
  • Hands-on experience managing Kubernetes clusters in GKE (on GCP) and EKS (on AWS).
  • Strong knowledge of Terraform, Helm, and GitLab CI/CD pipelines.
  • Proficiency in Python, Go, or Shell scripting for automation and tooling.
  • Experience implementing and managing observability stacks such as Prometheus, Grafana, and Datadog.
  • Deep understanding of Linux systems, cloud networking, and container orchestration concepts.
  • Experience working in Agile/Scrum environments and partnering closely with developers.
  • Excellent analytical skills with a proactive attitude and the ability to question assumptions and escalate risks early.
  • Experience with ArgoCD or Flux is preferred.
  • Familiarity with service mesh tools such as Istio or Linkerd, or with API gateways, is preferred.
  • Knowledge of cloud cost optimization, autoscaling, or security best practices is preferred.
  • Experience with incident management tools such as PagerDuty or ServiceNow is preferred.

Benefits

  • Flexible working arrangements, including remote work in India.
  • Competitive compensation.
  • 15 days of PTO plus casual leave and sick leave.
  • Family floater insurance coverage of 8 lakhs.
  • Personal accident and life insurance coverage worth 3x gross annual salary.
  • Career growth opportunities, including certifications and mentorship.
  • A collaborative, global team culture that values ownership, learning, and continuous improvement.
