Censys

Censys provides security teams with a comprehensive and accurate map of the internet, enabling them to defend their attack surfaces and proactively hunt for threats.

IT Services
51-250
Founded 2017
$53M raised

Description

  • Design, build, and maintain tooling to support applications running on Kubernetes and Google Cloud Platform.
  • Support developer workflows and the SDLC by writing supporting application code, automation, and platform integrations to enable developers to create, deploy, and manage services end-to-end.
  • Work with development teams to improve service resilience and reliability, and promote best practices and golden-path standardization.
  • Ensure smooth operations of production environments and collaborate with developers to debug complex incidents and maintain primary site uptime.
  • Create and maintain monitoring and observability for the four golden signals (latency, traffic, errors, and saturation) across applications.
  • Develop self-service platform components such as service catalogs, repository tooling, and documentation to accelerate developer velocity.
  • Participate in a shared on-call rotation for infrastructure and ensure on-call readiness and incident response for platform services.
  • Accelerate developer productivity by listening to internal customers, iterating on feedback, and delivering tools that reduce friction in CI/CD and deployment processes.

Requirements

  • 5+ years of experience in an SRE role or similar.
  • Experience deploying, managing, and debugging applications in Kubernetes; familiarity with Helm and Crossplane.
  • Experience building, securing, and managing container images.
  • Experience with cloud environments and services such as CloudSQL, Pub/Sub, Memorystore, and other GCP services.
  • Familiarity with Infrastructure-as-Code tools such as Terraform or Crossplane.
  • Experience with monitoring and observability tooling for the four golden signals, including Prometheus, Grafana, and OpenTelemetry.
  • Familiarity with monorepo and trunk-based development models and CI/CD systems such as GitHub Actions and ArgoCD, with a desire to achieve Continuous Deployment.
  • Ability to communicate with and support developers with empathy, and to promote automation and self-service to improve developer velocity.
  • Preferred: experience with gRPC microservice architectures and service mesh technologies (e.g., Istio) for observability and multi-cluster routing.
  • Preferred: ability to read and modify application code (majority in Go; some Python and Scala), familiarity with application security tooling (dependency scanning, static analysis, linting), and comfort with Linux-based environments.

Benefits

  • Fully remote position within the United States.
  • Salary ranges: $166,000 USD - $203,000 USD for high cost-of-living areas (Seattle, San Francisco Bay Area, New York City); $145,000 USD - $190,000 USD for other U.S. locations. Both ranges come with bonus eligibility and equity.
  • Bonus eligibility and equity participation.
  • Benefits effective on day one including 401(k) match, health, vision, and dental coverage.
  • Opportunity to work on critical infrastructure for internet intelligence at a company serving governments and large enterprises.
