Site Reliability Engineer (SRE) (AWS + Kubernetes + Python) - US only

255 Days Ago SCR (USA)
Remote Job

Job Description: Site Reliability Engineer (SRE) (AWS + Kubernetes + Python)

Start: ASAP

Duration: Long-term, full-time

Summary:

We are seeking a Site Reliability Engineer (SRE) with expertise in AWS, Kubernetes, and Python to ensure the reliability, scalability, and performance of mission-critical applications. The ideal candidate will focus on automation, observability, and incident response, working closely with development and operations teams to improve system reliability and efficiency.

 

Compensation:

Full-time (W2): $130K – $160K/year + benefits

Contract: $100–$110/hour

 

Responsibilities:

• Build and maintain scalable, highly available infrastructure on AWS

• Automate infrastructure provisioning using Terraform, Ansible, or CloudFormation

• Monitor system performance and troubleshoot incidents using Prometheus, Grafana, and ELK Stack

• Optimize Kubernetes clusters (EKS, GKE, AKS) for reliability and performance

• Develop and maintain CI/CD pipelines for seamless deployments

• Implement disaster recovery, failover strategies, and high availability solutions

• Ensure observability, logging, and tracing across distributed systems

• Collaborate with developers to design self-healing and fault-tolerant architectures

• Conduct post-mortems and root cause analysis for production incidents

 

Qualifications:

• 7+ years of experience in SRE, DevOps, or cloud infrastructure engineering

• Strong knowledge of AWS services (EC2, Lambda, S3, RDS, IAM, VPC, etc.)

• Experience with Kubernetes, Helm, and container orchestration

• Proficiency in Python, Bash, or Go for automation

• Familiarity with monitoring and logging tools (Datadog, Prometheus, New Relic)

• Experience implementing scaling, load balancing, and failover strategies

• Strong problem-solving skills and ability to work in a fast-paced environment

• Knowledge of security best practices, IAM, and cloud compliance

 

Applications close on 12-02-2035
