Senior Cloud Application Support Engineer (Remote - LATAM)

3 weeks ago
Contract
Senior
DevOps and Infrastructure
Atmosera

Atmosera is a trusted global cloud partner offering Azure managed cloud services with a focus on security and compliance for critical business applications worldwide.

IT Services
51-250 employees
Founded 1995

Description

  • Perform real-time monitoring and incident dispositioning for critical client applications using Dynatrace and Azure Insights.
  • Correlate metrics, traces, and logs to conduct root cause analysis and identify performance bottlenecks in distributed environments.
  • Lead triage of complex alerting environments to reduce noise and ensure high-priority incidents are handled effectively.
  • Analyze metrics and daily reports to detect early signs of instability and prevent service disruptions.
  • Evaluate runbooks and establish new standards for operating procedures, governance, and client environment management.
  • Serve as the primary technical point of contact for P1 incidents and coordinate communication across technical and business stakeholders.
  • Automate manual reporting processes to improve operational efficiency and reporting accuracy.
  • Enforce SRE best practices and SLA compliance, including guidance on incident handling and problem record creation.
  • Mentor junior team members on complex procedures and APM telemetry interpretation.
  • Collaborate on product strategy and best practices to improve the performance and stability of client environments.

Requirements

  • Bachelor’s degree in computer science or a related technical field, or equivalent professional experience.
  • 5+ years of technical experience in managed service providers or cloud hosting environments, with a senior systems administration background.
  • Bilingual proficiency is required.
  • Expert-level proficiency in Dynatrace and Azure Insights, including advanced configuration and environment optimization.
  • Advanced technical expertise in correlating metrics, traces, and logs for root cause analysis.
  • Deep understanding of SRE principles and experience managing critical P1 incidents under strict SLAs.
  • Strong leadership and communication skills for handling P1/P2 tickets and stakeholder coordination.
  • Experience evaluating support documentation and establishing governance and operating procedures.
  • Experience automating manual reporting processes and translating telemetry into actionable business insights.
  • Microsoft Azure certifications must be earned within 90 days of employment; which ones depends on current certifications and skill level.
  • Advanced certifications in Dynatrace or other APM platforms are highly preferred.
  • Technical certifications in Azure, Windows, O365, SQL, Linux, VMware, Cisco, Palo Alto, AWS, GCP, Terraform, Dynatrace, or DevOps are a plus.

Benefits

  • Remote work within LATAM.
  • Contract position.
  • Opportunity to work with a Microsoft Partner with multiple specializations and a strong cloud/AI/security focus.
