Stellar Cyber

Stellar Cyber provides Next-Gen SIEM and Network Detection and Response (NDR) platforms with AI-driven threat analysis, helping lean security teams secure their environments effectively.

Professional Services
51-250 employees
Founded 2017
$80M raised

Description

  • Administer and maintain container orchestration platforms and containerized workloads.
  • Monitor and troubleshoot production systems and participate in on-call rotations to ensure reliability.
  • Improve observability across systems and data platforms by enhancing monitoring, logging, and alerting.
  • Administer and optimize cloud-based environments across multiple providers.
  • Manage and support distributed data platforms and real-time processing systems.
  • Develop and maintain CI/CD pipelines for efficient and reliable deployments.
  • Own and implement Infrastructure as Code practices to improve consistency and scalability.
  • Automate and orchestrate infrastructure using programming and scripting languages.
  • Perform system administration and networking tasks for internal and external environments.
  • Collaborate with engineers and stakeholders across different time zones.

Requirements

  • 5+ years of experience in Site Reliability Engineering, DevOps, or Platform Engineering roles.
  • Proven success leading large-scale production systems in cloud environments such as AWS, GCP, Azure, or OCI.
  • Strong experience with production on-call operations and incident management, including driving incident response, on-call best practices, and a reliability-focused culture.
  • Advanced proficiency in Kubernetes administration and troubleshooting.
  • Hands-on experience with observability tools such as Prometheus, Grafana, Loki, and Alertmanager.
  • Knowledge of chat-based operations (ChatOps) interfaces and/or auto-remediation controllers built on an agentic AI framework.
  • Understanding of AI agents for auto-triaging alerts, correlating signals, and suggesting root-cause hypotheses.
  • Experience operating data platforms such as Elasticsearch, MongoDB, Spark, Kafka, and Redis.
  • Proficiency with public cloud services such as AWS, Azure, GCP, or OCI.
  • Strong programming and automation skills in Python and Bash.
  • Deep understanding of Infrastructure as Code tools such as Terraform and Helm.
  • Experience with CI/CD pipelines and tools such as GitHub Actions, Bitbucket, and ArgoCD.
  • Strong technical background in distributed systems, databases, networking, and Linux administration.
  • Bachelor's degree in Computer Science, Engineering, or a related technical field.
  • Certifications in AWS, GCP, observability, Linux, or Kubernetes are a plus.
