Sr. Site Reliability Engineer III (6448)

1 hour, 19 minutes ago
Full-time
Senior
Software Development
MetroStar

MetroStar builds innovative technology solutions designed to enhance and accelerate the missions of government agencies, leveraging a rich legacy of expertise in the digital age.

IT Services
251-1K employees
Founded 1999

Description

  • Design, deploy, and maintain mission-critical application workloads in virtualized or containerized environments such as VMware or Kubernetes.
  • Develop and sustain automated CI/CD pipelines, monitoring, and configuration management workflows across development, integration, staging, and production environments.
  • Provision, configure, and maintain developer environments and toolchains that support secure and efficient software delivery.
  • Identify friction across the software development lifecycle and implement solutions that improve the developer experience.
  • Establish and maintain customer trust through deep technical expertise and mission-focused problem solving.
  • Support operational observability and reliability for highly available production systems.
  • Participate in incident response, root cause analysis, and continuous improvement activities.

Requirements

  • Active Top Secret clearance or higher.
  • Certification meeting DoD 8140 requirements, such as Security+ or higher.
  • Bachelor’s degree in Computer Science or a related engineering field preferred; relevant experience may substitute.
  • 7+ years of experience in software development, systems engineering, or operations roles focused on availability, performance, and reliability.
  • Experience blending software engineering and systems administration practices to support highly available, scalable applications.
  • Experience designing and managing monitoring, alerting, and observability solutions to meet Service Level Objectives.
  • Experience with Ansible and Desired State Configuration.
  • Experience with GitLab CI/CD automation and Bash scripting.
  • Experience with Kubernetes, including container-native storage and object storage solutions such as MinIO, S3-compatible services, or Portworx.
  • Experience with enterprise load-balancing solutions such as F5 or similar platforms.
  • Ability to contribute immediately with minimal ramp-up in a mission-critical operational environment.
  • Designated as essential personnel; may be required to work during government shutdowns, emergencies, or other critical situations.

Benefits

  • Salary range of $185,000 to $230,000.
  • Eligible for performance-based bonuses and additional incentives based on individual and company performance.
  • Company-paid training and/or certifications.
  • Referral bonuses.
  • Health, dental, and vision insurance.
  • 401(k) retirement plan with company match.
  • Paid time off and holidays.
  • Parental leave, dependent care, flexible work arrangements, professional development opportunities, and employee assistance and wellness programs.
