Senior Software Engineer - Reliability Engineer (Remote)

The Home Depot

Atlanta, GA
Full Time
Senior
29 days ago

Job Description

About the Role

The Senior Reliability Engineer is responsible for ensuring the reliability, availability, and performance of our systems and applications. As a Senior Reliability Engineer, you will work closely with a team of engineers to build and maintain reliable infrastructure and systems. You will also assist in tool selection, configuration, security, resilience, performance tuning, and production monitoring. Senior Reliability Engineers contribute to foundational infrastructure elements and system-related documentation. You will play a key role in the reliability team and are expected to mentor and support junior engineers.

Key Responsibilities

  • Develops, tests, deploys, and maintains software, with a clear understanding of the value the software is to provide.
  • Takes on new opportunities and tough challenges with a sense of urgency, high energy and enthusiasm.
  • Consistently achieves results, even under tough circumstances.
  • Develops test suites (functional, destructive, etc.) to enable successful, rapid deployment of code to production.
  • Takes a broad view when approaching issues, using a global lens.
  • Learns through successful and failed experiments when tackling new problems.
  • Actively seeks ways to grow and be challenged using both formal and informal development channels.
  • Collaborates with other team members in agile processes.
  • Creates new and better ways for the organization to be successful.
  • Works with the Product Team to ensure user stories are valuable, developer ready, easy to understand and testable.
  • Delivers multi-mode communications that convey a clear understanding of the unique needs of different audiences.
  • Adapts approach and demeanor in real time to match the shifting demands of different situations.
  • Relates openly and comfortably with diverse groups of people.
  • Helps grow junior engineers by providing guidance on modern software development frameworks and leading technical discussions.

Requirements

  • Must be eighteen years of age or older.
  • Must be legally permitted to work in the United States.
  • 3 or more years of relevant work experience.
  • Experience with infrastructure automation tools such as Terraform and Ansible.
  • Extensive experience managing Google Cloud Platform projects and services including infrastructure, Compute, Developer Tools, Security and Identity.
  • Experience with monitoring and observability tools like Prometheus, Grafana, and OpenTelemetry.
  • Familiarity with both Unix and Windows operating systems.
  • Experience with security frameworks for user and service authorization and authentication.
  • Experience in creating and executing unit, functional, destructive, and performance tests.
  • Experience with modern debugging and root cause analysis techniques.
  • Experience with version control systems.
  • Experience in designing systems for High Availability, Disaster Recovery, Performance, Efficiency, and Security.
  • Operational support experience with a focus on system reliability.
  • Strong communication and collaboration skills with experience writing documentation, providing peer tutelage, providing consultative services, and presenting technical solutions and training to both technical and non-technical audiences.

Nice to Have

  • Ability to share knowledge across engineering functions.

Qualifications

  • The knowledge, skills and abilities typically acquired through the completion of a bachelor's degree program or equivalent degree in a field of study related to the job.

Working at The Home Depot

The description does not explicitly detail the company's values or work environment but emphasizes collaboration, innovation, adaptability, and continuous learning.

Job Details

Posted At: Jun 28, 2025
Job Category: DevOps
Salary: Competitive salary
Job Type: Full Time
Work Mode: Remote
Experience: Senior

About The Home Depot

Website: homedepot.com
Company Size: 10000+ employees
Location: Atlanta, GA
Industry: Home Centers
