Site Reliability Engineer (Top Secret Clearance)

8 hours, 59 minutes ago
Full-time
Junior
DevOps and Infrastructure
SpaceX

SpaceX designs, manufactures, and launches advanced rockets and spacecraft with the aim of revolutionizing space technology and enabling human life on other planets.

Aerospace & Defense
10K-50K
Founded 2002

Description

  • Develop automation to deploy and manage compute resources across on-premises and cloud environments.
  • Build, maintain, and scale on-premises hardware systems for GPU-accelerated machine learning workloads.
  • Deploy and manage core infrastructure services such as databases, monitoring, and storage.
  • Collaborate closely with software engineers to create scalable, operable, and maintainable products.
  • Own the full lifecycle of services from inception and design through deployment, operation, and refinement.
  • Provide development, testing, and operational support for the systems created by the team.

Requirements

  • Bachelor’s degree in computer science, information systems/IT, or an engineering discipline, or 2+ years of professional experience in software, DevOps, or site reliability engineering in lieu of a degree.
  • 1+ year of experience with Kubernetes.
  • 1+ year of experience with Linux operating systems.
  • Experience with Bash, Python, and/or other scripting languages.
  • Experience building, maintaining, and scaling on-premises and/or cloud systems.
  • Active Top Secret, Top Secret SCI, or DOE Level Q clearance is highly desired.
  • Experience hosting and advancing inferential model benchmarks is preferred.
  • Experience with systems administration, site reliability engineering, or DevOps engineering is preferred.
  • Experience with Python and Python-based development frameworks is preferred.
  • Experience with virtualization and hypervisor technologies is preferred.
  • Experience with automated management of dozens to hundreds of servers is preferred.
  • Knowledge of performance bottlenecks and performance improvement techniques is preferred.
  • Excellent communication skills and the ability to communicate with customers, peers, and management in formal and informal settings are preferred.
  • Ability to quickly learn new tools and frameworks is preferred.
  • Must be willing to work extended hours and weekends as needed.
  • Must be a U.S. citizen, national, lawful permanent resident, refugee, asylee, or otherwise eligible under ITAR requirements.

Benefits

  • Pay range of $145,000 to $175,000 for Level 3.
  • Potential 10% clearance differential, up to an additional $20,000 annually, once officially briefed into a classified program.
  • Long-term incentives in the form of company stock or long-term cash awards.
  • Potential discretionary bonuses.
  • Employee Stock Purchase Plan with the ability to buy additional stock at a discount.
  • Comprehensive medical, vision, and dental coverage.
  • 401(k) retirement plan, short- and long-term disability insurance, and life insurance.
  • Paid parental leave, approximately 3 weeks of paid vacation, and 10 or more paid holidays per year.
