Toptal

Toptal is a curated talent marketplace connecting freelance software developers, designers, finance experts, product managers, and project managers with businesses globally. With a focus on top 3% talent in software engineering, design, and finance, To...

Construction & Engineering
5K-10K
Founded 2010
$4M raised

Description

  • Design, deploy, and maintain core logging, database, networking, and monitoring infrastructure across hundreds of servers.
  • Build systems, automation, tooling, and workflows that support DevOps practices and help developer teams own more of the software lifecycle.
  • Implement monitoring, automated health checks, troubleshooting procedures, and maintenance documentation.
  • Collaborate with engineering teams to improve tools, systems, procedures, and data security.
  • Participate in daily scrum standups, pair programming, peer code reviews, and team collaboration in Slack and Zoom.
  • Design, develop, document, analyze, test, and modify computer or cloud-based systems and programs.
  • Support infrastructure design, architecture, implementation, and compatibility/performance improvements.
  • Participate in the on-call rotation for infrastructure systems, both during and after business hours.
  • Investigate outages and performance issues, determine root causes, and coordinate resolution with other teams.
  • Plan and coordinate testing for changes, upgrades, patches, new releases, and new services.

Requirements

  • 5+ years of experience in Linux debugging, networking, routing, IP addressing, load balancing, and VPNs.
  • Experience managing infrastructure configuration and provisioning through code for large distributed systems on public cloud platforms, preferably AWS and GCP.
  • Hands-on experience with infrastructure-as-code tools such as Ansible and Terraform, or strong experience with Puppet or Chef.
  • Understanding of version control, with code maintained in Git.
  • Experience running relational databases (RDBMS), especially PostgreSQL; transferable knowledge from MySQL, SQLite, and similar databases is preferred.
  • Hands-on experience with monitoring and alerting tools such as Graphite, Grafana, Prometheus, InfluxDB, and Sensu.
  • Strong troubleshooting skills for resolving complex problems through established processes.
  • Strong understanding of modern systems and service-oriented architecture.
  • Proficiency in scripting languages such as Python, Bash, or Ruby.
  • Experience with Docker and Docker Compose, including building optimized Dockerfiles, is an advantage.
  • Experience with Kubernetes production operations, troubleshooting, debugging, cluster provisioning, and management is an advantage.
  • Excellent written and verbal communication skills.
  • Ability to work in a fast-paced, rapidly growing company and manage a wide variety of challenges and deadlines.
  • Eagerness to help teammates, share knowledge, and learn from others.
  • Must be a world-class individual contributor and not primarily focused on managing others.
  • Resumes and communication must be submitted in English.
  • No visa sponsorship or visa assistance is provided.
  • Remote work is required.
  • Candidates should be based in South America, Central America, or Europe.
