ProArch

At ProArch, we help our clients accelerate growth and mitigate risk with IT services, cybersecurity services, application development, cloud computing, and data analytics. ProArch was founded on the belief that a future where change is ‘business as usu...

Industry: Internet Software & Services
Company size: 251-1K employees
Founded: 2006

Description

  • Monitor system performance and reliability to ensure uptime meets organizational SLAs.
  • Implement and maintain observability tools for proactive issue detection through metrics and logs.
  • Troubleshoot and resolve complex production issues across infrastructure components.
  • Collaborate with software engineering teams to design and implement scalable, fault-tolerant architectures.
  • Develop and maintain automation scripts for deployment, monitoring, and system management.
  • Participate in the on-call rotation to respond to incidents and perform root cause analysis.
  • Contribute to capacity planning and performance tuning for optimal resource utilization.
  • Document infrastructure, processes, and incident responses to support knowledge sharing.

Requirements

  • 8+ years of experience as a Site Reliability Engineer, DevOps Engineer, or in a related role.
  • Strong experience with cloud providers such as AWS, Azure, or GCP.
  • Proficiency in scripting languages such as Python, Bash, or Go.
  • Experience with container orchestration tools like Kubernetes.
  • Familiarity with CI/CD pipelines and tools such as Jenkins or GitLab CI.
  • Experience with Snowflake, including account administration.
  • Solid understanding of networking and security principles.
  • Experience with monitoring and logging tools such as Prometheus, Grafana, or the ELK stack.
  • Excellent problem-solving skills and a proactive attitude.
  • Strong communication and teamwork skills with an emphasis on collaboration.
  • Experience with Infrastructure as Code tools such as Terraform or CloudFormation is preferred.
  • Knowledge of service mesh architectures and modern microservices patterns is preferred.
  • Background in software development and familiarity with Agile methodologies are preferred.
