Coalfire

Coalfire is a cybersecurity advisor that helps organizations avert threats, reduce risk, and turn security into a competitive advantage, fueling their success.

Internet Software & Services
251-1K employees
Founded 2001
$9M raised

Description

  • Support a collaborative engineering team across cloud infrastructure administration, site reliability engineering, security operations, and vulnerability management for multiple clients.
  • Coordinate with client product teams, engineering teams, and other stakeholders to monitor and maintain secure, resilient cloud-hosted infrastructure to SLAs in production and non-production environments.
  • Design, implement, and maintain automated orchestration, configuration management, and Infrastructure-as-Code solutions.
  • Create, peer review, and manage orchestration, configuration management, and IaC codebases, including version control within client environments.
  • Implement and upgrade client environments using CI/CD infrastructure code and communicate environment requirements to development teams.
  • Work across AWS, Azure, and GCP to configure, tune, troubleshoot, and manage cloud services, cost, security, and compliance.
  • Monitor and resolve site stability and performance issues related to functionality and availability.
  • Provide 24x7x365 support through client ticketing systems and participate in on-call rotations as needed.
  • Support incident response and disaster recovery documentation, testing, validation, and exercises.
  • Maintain cloud architecture diagrams, standard operating procedures, operational runbooks, technical documentation, and troubleshooting guides.

Requirements

  • BS degree or above in an Information Technology-related field, or an equivalent combination of education and experience.
  • 2+ years of experience in 24x7x365 production operations.
  • Fundamental understanding of networking and network troubleshooting.
  • 2+ years of experience installing, managing, and troubleshooting Linux and/or Windows Server operating systems in a production environment.
  • 2+ years of experience supporting cloud operations and automation in AWS, Azure, or GCP, with aligned certifications expected.
  • 2+ years of experience with Infrastructure-as-Code and orchestration/automation tools such as Terraform and Ansible.
  • Experience with IaaS platform capabilities and services, with cloud certifications expected.
  • Experience using ticketing tools such as Jira and ServiceNow.
  • Experience using environmental analytics tools such as Splunk and Elastic Stack for querying, monitoring, and alerting.
  • Experience in at least one primary scripting language such as Bash, Python, or PowerShell.
  • Excellent communication, organizational, and problem-solving skills in a dynamic environment.
  • Effective documentation skills, including technical diagrams and written descriptions.
  • Ability to work as part of a team with a professional attitude and demeanor.
  • Previous experience in a consulting role in dynamic, fast-paced environments is preferred.
  • Previous experience supporting a 24x7x365 highly available environment for a SaaS vendor is preferred.
  • Experience supporting security or infrastructure incident handling and investigation, or with system scenario re-creation, is preferred.
  • Experience with containers and container orchestration solutions such as Docker, Kubernetes, EKS, or ECS is preferred.
  • Experience working within an automated CI/CD pipeline for release development, testing, remediation, and deployment is preferred.
  • Cloud-based networking experience with tools or platforms such as Palo Alto or Cisco ASAv is preferred.
  • Familiarity with frameworks such as FedRAMP, FISMA, SOC, ISO, HIPAA, HITRUST, or PCI is preferred.
  • Familiarity with configuration baseline standards such as CIS Benchmarks and DISA STIG is preferred.
  • Knowledge of encryption technologies such as SSL and PKI is preferred.
  • Experience with diagramming tools such as Visio or Lucidchart is preferred.
  • Application development experience for cloud-based systems is preferred.

Benefits

  • Flexible work model with the ability to choose when and where you work, including remote options.
  • Paid parental leave.
  • Flexible time off.
  • Certification and training reimbursement.
  • Digital mental health and wellbeing support membership.
  • Comprehensive insurance options.
  • Opportunities to join employee resource groups and participate in in-person and virtual events.
  • Annual incentive, commission, and/or recognition program eligibility.
