Anduril Industries

Anduril Industries is an American defense technology firm that specializes in developing advanced autonomous systems for integrated awareness and security across land, sea, and air, utilizing its proprietary Lattice platform to enhance intelligence, su...

Aerospace & Defense
1K-5K employees
Founded 2017
$2200M raised

Description

  • Manage and expand on-premises developer servers, Hardware-in-the-Loop systems, and other on-site compute resources.
  • Design, implement, and maintain highly available, fault-tolerant, and resilient autonomous systems.
  • Identify and eliminate performance bottlenecks to ensure low-latency, high-throughput, real-time operations.
  • Develop monitoring, logging, tracing, and alerting solutions that provide visibility into system health at scale.
  • Automate operational tasks including provisioning, deployment, testing, and recovery.
  • Scale services and infrastructure to support evolving mission demands, including distributed systems and edge deployments.
  • Work with security teams to integrate best practices into operational processes and infrastructure.
  • Create documentation, runbooks, and playbooks for operational procedures.
  • Integrate open-source, commercial, and internal tooling to improve software delivery.
  • Collaborate with Developer Platform, Networking, Security, and autonomy software teams in a fast-paced, multidisciplinary environment.

Requirements

  • Bachelor of Science degree in Computer Science, Engineering, or a related field, or equivalent work experience.
  • 5+ years of experience in Site Reliability Engineering, DevOps, or a similar role focused on security for mission-critical applications.
  • Strong proficiency in at least one modern programming language such as Python or Go.
  • Experience with automation tools such as Ansible, Puppet, or Terraform.
  • Deep expertise with Linux operating systems and strong command-line skills.
  • Knowledge of secure coding practices and experience implementing security controls in cloud and on-premises environments.
  • Solid understanding of networking fundamentals including TCP/IP, DNS, HTTP, and load balancing.
  • Proficiency with Docker and Kubernetes.
  • Strong analytical, problem-solving, and debugging skills.
  • Excellent communication skills and ability to work effectively in cross-functional teams.
  • Must be a U.S. Person due to access to U.S. export-controlled information or facilities.
  • Active U.S. Security Clearance.
  • Experience with edge computing, mesh networks, or highly distributed autonomous systems (preferred).
  • Experience with embedded Linux systems development and associated tools (preferred).
  • Experience troubleshooting and analyzing remotely deployed software systems (preferred).
  • Familiarity with monitoring, logging, and security tooling such as auditd, journald, SELinux, or Splunk (preferred).
  • Prior experience in defense, aerospace, robotics, or other mission-critical domains (preferred).
  • Extensive experience with cloud platforms such as AWS, Azure, or GCP (preferred).

Benefits

  • US salary range of $166,000 to $220,000.
  • Highly competitive equity grants are included in the majority of full-time offers.
  • Comprehensive, competitive benefits package available at little to no cost to employees.
  • Support for health, recovery, and future needs.
  • Full-time employee benefits with top-tier coverage.
