Senior SRE Engineer

5 days, 4 hours ago
Full-time
Senior
Software Development
Trustly

Trustly specializes in developing and providing online payment solutions that leverage Open Banking technology to enhance payment processes, reduce costs, and streamline financial services for consumers, merchants, and banks.

Diversified Financial Services
251-1K
Founded 2008

Description

  • Architect, design, and implement strategies to ensure high availability, reliability, and fault tolerance of infrastructure and applications.
  • Lead incident response efforts, perform root cause analysis, implement preventative measures, and own post-incident follow-ups and remediation.
  • Monitor and observe production systems using automation tools to detect, triage, and resolve reliability issues.
  • Identify performance bottlenecks, conduct performance analysis, and optimize system and application performance.
  • Drive automation initiatives to remove manual toil by developing and maintaining tools, scripts, and frameworks for deployment, monitoring, and troubleshooting.
  • Generate regular reports on system reliability, uptime, and performance metrics and present findings, trends, and recommendations to management and stakeholders.
  • Collaborate with cross-functional teams to define SLIs, SLOs, error budgets, and KPIs, and to develop reporting frameworks that track system health and operational efficiency.
  • Support and maintain critical services running in AWS and on-premises, including system, security, and network monitoring and maintenance.

Requirements

  • Bachelor's degree in Computer Science or a related field.
  • Experience building SLIs, SLOs, and error budgets based on business rules.
  • IT project management experience.
  • Coding experience with Python, Java, shell scripting (e.g., Bash), or similar languages.
  • Experience supporting critical production services in the cloud (AWS) and on-premises environments.
  • Experience with network technologies and system, security, and network monitoring tools.
  • Detailed technical knowledge of databases and the Linux operating system, including standards and best practices for keeping services up and running.
  • Proactive approach to spotting problems, removing manual processes/toil using code, and fixing performance concerns programmatically.
  • Advanced English.
  • Ability to work remotely from Brazil (remote-first culture; position supports working from any city in Brazil).

Benefits

  • Bradesco health and dental plan for you and your dependents with no co-payment cost.
  • Life insurance with differentiated coverage.
  • Meal voucher and supermarket voucher.
  • Home office allowance and remote-first flexible hours (work from any city in Brazil).
  • Gympass access to physical activity spaces and online classes.
  • English program with online group classes and private teacher.
  • Welcome kit with Apple equipment (MacBook Pro, iPhone) and option to purchase equipment under internal criteria.
  • Annual discretionary bonus (annual premium) based on company KPIs, plus rewards through the employee referral program.
