Civica

Civica is a global leader in public sector software, providing digital solutions and managed services to transform customer experience and operational efficiency for over 3,000 organizations worldwide.

Internet Software & Services
1K-5K
Founded 2002

Description

  • Architect, implement, and continuously improve data center and cloud environments across AWS, Azure, and VMware.
  • Ensure platform reliability, performance, and security meet service-level agreements and scale with demand.
  • Build and evolve infrastructure as code and CI/CD pipelines to release features safely and efficiently.
  • Partner with teams to define, measure, and improve service-level indicators (SLIs) and service-level objectives (SLOs).
  • Implement real-time observability and proactively identify risks before they affect users.
  • Own the on-call rota and lead incident response for production issues.
  • Coach teams through blameless post-mortems and drive continuous improvement after outages.
  • Collaborate with principal engineers, developers, product teams, and security teams on platform roadmaps and controls.
  • Mentor engineers through pairing, brown-bag sessions, and reliability best-practice evangelism.
  • Embed security controls into CI/CD, runtime environments, and disaster-recovery planning.

Requirements

  • Demonstrable experience in a production SRE, DevOps, or infrastructure role, ideally in a SaaS or large-scale web environment.
  • Expertise in at least one public cloud platform: AWS, Azure, or GCP.
  • Experience designing hybrid migrations from on-premises infrastructure to cloud.
  • Strong coding, scripting, and troubleshooting skills in Go, .NET, Java, Python, or similar.
  • Proven experience with infrastructure as code tools such as Terraform or CloudFormation.
  • Experience with container orchestration platforms such as Kubernetes, ECS, AKS, or OpenShift.
  • Experience with virtual machine orchestration, provisioning, and resiliency tools such as KubeVirt, Packer, or Ansible.
  • Deep understanding of monitoring, logging, and tracing tools such as Prometheus/Grafana, ELK/OpenSearch, or Jaeger.
  • Excellent communication skills and experience working in cross-functional teams.
  • Passion for building reusable, tested libraries and tooling.

Benefits

  • 25 days of annual leave plus bank holidays, with the option to buy up to 10 extra days.
  • Up to 3 additional days off for volunteering through the Days of Difference program.
  • 5% employer pension match.
  • Income protection covering up to 75% of salary for long-term illness.
  • Life assurance paying a tax-free lump sum of 4x salary.
  • Critical illness cover of £25,000, extendable to dependents.
  • Private medical insurance, health cash plan, and dental insurance.
  • Electric vehicle and hybrid vehicle scheme.
