Newsela

Newsela is an Instructional Content Platform that supercharges reading engagement and learning in every subject. They provide engaging, relevant instructional content for teachers that both educates and inspires. With a focus on meaningful classroom le...

Diversified Consumer Services
251-1K employees
Founded 2013
$172M raised

Description

  • Participate in an on-call rotation to respond to incidents affecting Newsela.com availability and support developers during internal and external incidents.
  • Maintain and extend infrastructure automation using Terraform, GitHub Actions CI/CD, Prefect, and AWS services.
  • Build monitoring and alerting that detects symptoms before outages using Datadog, Sentry, and CloudWatch.
  • Identify repeatable manual tasks and automate them to reduce on-call toil.
  • Improve operational processes for deployments, releases, and migrations with fault tolerance in mind.
  • Design, build, and maintain core cloud infrastructure on AWS and GCP to support thousands of concurrent users.
  • Debug production issues across multiple services and layers of the stack.
  • Provide infrastructure and architectural planning support as an embedded partner to application development teams.
  • Plan and support the growth of Newsela’s infrastructure.
  • Complete root cause analyses, readiness reviews, and documentation such as architecture diagrams, process diagrams, and runbooks.

Requirements

  • Experience with infrastructure as code, especially Terraform and GitHub CI/CD for automation.
  • Experience containerizing environments with Docker and ECS.
  • Experience managing operating systems, storage, networking, and high-availability datastores such as MySQL, Postgres, Neo4j, and Redis.
  • Experience implementing monitoring and instrumentation in Datadog, Sentry, log management systems, and Slack/JIRA integrations.
  • Working knowledge of availability, reliability, scalability, and disaster recovery practices.
  • Ability to work in Shell, IaC, Python, and SQL.
  • Familiarity with agile methodologies and using epics and issues to drive projects.
  • Ability to organize personal and team workloads and contribute to OKR leadership.
  • Ability to self-organize and work asynchronously.
  • Experience leading scope and design discussions, contributing to documentation, and improving team practices through code reviews and incident handoffs.
  • Experience mentoring, handling conflict constructively, and maintaining strong relationships with other engineering teams.
  • Contract role; not eligible to participate in company-sponsored benefits.

Benefits

  • Fully remote work environment with a monthly tech stipend for work-from-home needs.
  • Comprehensive medical benefits with employer contributions to premiums and HSA accounts.
  • Additional wellness perks including gym reimbursement, pet insurance, and free access to the Calm app and Rocket Lawyer.
  • Inclusive family support including parental leave, fertility support, and adoption support.
  • 401(k) plan with employer match.
  • Flexible PTO, paid sick time off, company holidays, and winter break from December 24th to January 1st.
  • Annual learning and development allowance for training, classes, workshops, conferences, and educational materials.
