Title:  Site Reliability Engineer


Greenqloud, IS

Requisition ID:  50741
Job Summary

NetApp is developing a new and broad portfolio of SaaS solutions that enable our customers to harness the power of their data in new and interesting ways. In support of that mission, we are rapidly expanding our new Site Reliability Engineering (SRE) organization to run these new SaaS offerings. We are looking for smart individuals who get things done to join our team and help deliver these amazing capabilities to market.


Job Summary:

As a Site Reliability Engineer, you’ll manage a portfolio of customer-facing cloud services (SaaS/IaaS), ensuring their overall availability, performance and security. This role is based in Reykjavik and reports directly to a Director of Cloud Operations. You will work in a highly collaborative environment with NetApp and Google/Microsoft teams from all over the world. This position includes rotational on-call work due to the critical nature of the services we support.


Job Requirements:

  • Administration of the cloud-based environments that support our SaaS / IaaS offerings, which are implemented on our Kubernetes-based architecture.
  • Automation of repetitive and error-prone tasks and processes, using tools and languages like Ansible, Python and PowerShell.
  • Monitoring of the production environments for issues, using tools like Prometheus, Observium, Elasticsearch and Splunk.
  • Continuous measurement of availability, latency and system health using tools like Grafana, OpsRamp and others.
  • Ability to secure the environment against threats by deploying patches and least-privilege configurations.
  • Ability to respond to incidents and drive changes that prevent them from recurring, using each incident as an opportunity to automate recovery.
  • Use Atlassian Jira to track issues to resolution based on their priority.
  • Design and implement tools for automated deployment to multiple environments such as AWS, Google Cloud and Azure.


Key Characteristics:

  • You are generally curious and highly motivated, with a passion for ensuring scalable, performant and highly available solutions.
  • You are great at, and love, debugging and solving technical problems throughout a technology stack.
  • You have the mindset that if you must do the same thing twice, you’re going to automate it!
  • You have strong interpersonal skills, including verbal and written communication skills.
  • Familiarity with Linux and Windows operating systems.
  • The ability to write automation, and an understanding of at least one language or tool such as Bash, Python, PowerShell, Ansible or Go.
  • A basic understanding of public cloud vendors such as AWS, Azure, Google Cloud or others.
  • A Bachelor of Science Degree is preferred.