Ensono

Ensono provides comprehensive hybrid IT solutions and governance, enabling businesses to navigate complexity and modernize their technology infrastructure, from cloud services to mainframe systems, tailored to each client's unique journey.

IT Services
1K-5K
Founded 1969

Description

  • Engineer and operate a scalable monitoring and observability platform for Ensono’s Hybrid Cloud clients.
  • Plan and execute the strategic roadmap for observability and monitoring tools in alignment with business and client requirements.
  • Define monitoring best practices, including proactive alerting, anomaly detection, and performance analytics.
  • Operate and optimize end-to-end monitoring solutions for real-time visibility into network, distributed systems, and applications.
  • Establish automated alerting thresholds based on Service Level Objectives (SLOs) and Service Level Agreements (SLAs).
  • Establish monitoring audit standards for conformance and compliance across standard and custom monitors.
  • Serve as the point of escalation for day-to-day monitoring-related incidents.
  • Automate monitoring configurations and telemetry collection using scripting and Infrastructure as Code tools such as Ansible and Terraform.

Requirements

  • 7+ years of experience in operational observability or monitoring engineering roles.
  • 7+ years of hands-on experience with ITSM platforms such as ServiceNow and monitoring tools such as BMC, Datadog, Entuity, or similar.
  • Strong proficiency in Python, Bash, and JavaScript for automation and scripting.
  • Experience with Infrastructure as Code tools such as Ansible and Terraform for observability tool deployment.
  • Strong analytical and problem-solving skills for diagnosing complex issues.
  • Effective communication and leadership skills, especially for training and cross-functional collaboration.
  • Ability to think holistically about business-impacting processes and continuously refine them.
  • Ability to thrive both independently and collaboratively in a fast-paced environment while managing priorities effectively.
  • Bachelor’s degree in a related field.
  • Master’s degree in an information technology-related field (preferred).
  • Proficiency with cloud platforms such as AWS, Azure, or GCP, and with Kubernetes deployment and monitoring (preferred).
  • Advanced ITIL certification or training, including ITIL v3 or v4 (preferred).
  • Experience integrating AI/ML into ITSM practices (preferred).
