Omilia

Omilia is a global leader in Conversational AI, offering AI-based self-service solutions for enhanced customer care fulfillment and success.

IT Services

Information Technology

251-1K (360)

Founded 2002

$20M raised

32 open positions

Senior Site Reliability Engineer

1 month, 1 week ago

Australia

Full-time

Senior

Site Reliability Engineer (SRE)

DevOps and Infrastructure

Agile Ansible AWS Bash CentOS Go Grafana Kubernetes MySQL PostgreSQL Prometheus Python Redis RHEL TCP/IP Terraform

Description

Ensure platform reliability and availability across production and pre-production environments through proactive monitoring, alerting, and automation.
Serve as first response for incidents and contribute to problem management and root cause analysis.
Support development teams in building a reliability-focused culture within the development lifecycle.
Develop troubleshooting documentation and production support materials.
Collaborate with engineering teams to create optimized runbooks, operational documentation, and automation for operational tasks.
Work with development and cloud engineering teams to embed reliability and performance into the software delivery lifecycle.
Design, implement, and evolve observability solutions using metrics, logs, traces, and dashboards.
Use tools such as Prometheus, Grafana, and the ELK stack (Elasticsearch, Logstash, Kibana) to improve monitoring and visibility.
Participate in on-call rotations and continuously improve alert quality and response processes.
Champion continuous improvement in reliability and performance across teams.

Requirements

Bachelor's degree or MS in Engineering, or equivalent experience.
Experience operating at least one container orchestration cluster, such as Kubernetes or Docker Swarm.
Experience developing or maintaining software for production services at scale.
Experience with ELK.
Experience with AWS.
Experience with the Grafana/Prometheus stack.
Strong scripting skills in Bash, Python, or Go.
Excellent communication skills.
Ability to think creatively, anticipate challenges, and question existing technologies and procedures.
Comfort working in agile/lean methods and iterating collaboratively.
Strong team-player mindset and ability to work across product, experience design, engineering, and other functions.
Telephony knowledge, including SIP and VoIP, is a plus.
Experience in Linux administration, including Red Hat, CentOS, or AL, is a plus.
Working knowledge of infrastructure-as-code and configuration management tools such as Terraform and Ansible is a plus.
Experience with TCP/IP and general networking concepts is a plus.
RDBMS knowledge, such as MySQL or PostgreSQL, is a plus.
NoSQL knowledge, such as Redis, is a plus.

Benefits

Fixed compensation.
Long-term employment with vacation days.
Professional development support, including courses and training.
Opportunity to work on cutting-edge products with global impact in the service industry.
A collaborative, fun-to-work-with team.
Apple gear provided.
Equal opportunity employer with a diverse and inclusive workplace.
