START: ASAP
JOB TYPE: LONG TERM, FULL TIME REMOTE JOB
SENIOR SITE RELIABILITY ENGINEER (US-BASED, LATAM CANDIDATES WELCOME)
We seek a Senior SRE to optimize, automate, and maintain cloud infrastructure for high availability and scalability.
RESPONSIBILITIES:
• Manage AWS/GCP/Azure cloud environments for performance, security, and reliability.
• Automate deployments with Terraform, Ansible, CloudFormation, or Pulumi.
• Optimize CI/CD pipelines using Jenkins, GitHub Actions, or GitLab CI/CD.
• Implement monitoring, logging, and alerting (Datadog, Prometheus, ELK).
• Enforce security best practices, IAM policies, and compliance requirements.
• Troubleshoot incidents, lead root cause analysis (RCA), and optimize system performance.
REQUIREMENTS:
• 5+ years of experience in SRE/DevOps, with expertise in AWS, GCP, or Azure.
• Strong Kubernetes (EKS/GKE/AKS), Docker, and IaC (Terraform, Pulumi) skills.
• Proficiency in Python, Go, or Bash for automation.
• Experience with incident management, cost optimization, and security.
• Fluent in English, with excellent problem-solving skills.