Job Description

Role: Site Reliability Engineer (SRE)

Locations: Atlanta, GA / Bellevue, WA / Frisco, TX / Overland Park, KS (onsite from day 1)

Job Type: Full Time

Required Skills:

Reliability Engineering, Kubernetes, Cloud Platform, Python Scripting

The opportunity:

Design, build, and maintain reliable, scalable, and secure cloud-based infrastructure (AWS, Azure, or GCP).
Develop and improve observability using monitoring, alerting, logging, and tracing tools (e.g., Prometheus, Grafana, ELK, Datadog).
Automate repetitive tasks and infrastructure using Infrastructure-as-Code (Terraform, CloudFormation, Pulumi).
Create and maintain CI/CD pipelines (GitHub Actions, GitLab CI, Jenkins, ArgoCD, etc.) to support fast and safe delivery.
Lead incident response, root cause analysis, and postmortems to ensure high uptime and rapid recovery.
Optimize system performance, reliability, and cost-effectiveness through proactive monitoring and tuning.
Collaborate with software engineering teams to define SLAs/SLOs and improve service reliability.
Implement and maintain security best practices across environments (e.g., secrets management, IAM, firewalls, etc.).
Maintain disaster recovery plans, backups, and high-availability strategies.

Required:

5+ years of experience as an SRE or DevOps Engineer, or in a similar role.
Proficiency in scripting and automation (Bash, Python, Go, etc.).
Strong experience with containerization and orchestration (Docker, Kubernetes, Helm).
Solid understanding of Linux systems administration and networking fundamentals.
Experience with cloud platforms (AWS, Azure, or GCP).
Experience with IaC tools like Terraform or CloudFormation.
Familiarity with GitOps and modern deployment practices.
Hands-on experience with observability tools (e.g., Prometheus, Grafana, Datadog).
Strong troubleshooting and incident response skills.

Preferred:

Experience in a high-traffic, microservices-based architecture.
Exposure to service meshes (Istio, Linkerd).
Certifications (AWS Certified DevOps Engineer, CKA, etc.).
Experience with security automation and compliance (e.g., SOC2, ISO27001).

Note: Visa-independent candidates are preferred.

Site Reliability Engineer (SRE) - Full Time Job at Saransh Inc, Atlanta, GA
