
Title:  Site Reliability Engineering Lead

Location:  USA - Massachusetts - Waltham
Requisition ID:  868

Job Summary

The OnCommand Insight team at NetApp is one of the fastest-moving teams. We deliver enterprise software to manage infrastructure and applications. NetApp’s OnCommand Insight is one of our most successful products, with significant sales to large global enterprises. As we build our new Cloud offering, we are looking for a person to build and lead our Site Reliability Engineering team. If you have experience overseeing complex software services and keeping them reliable and available, you are needed here.


  • Build architecture and operations tools to run a SaaS product.
  • Lead a team of SREs.
  • Build the right processes to run the service and its operation.
  • Work within the product team on the design and architecture of the product.
  • Provide direction and supervision to the group or groups of engineers responsible for:
    • Coding
    • Testing
    • Test automation
    • Debugging
    • Reliability
    • Performance analysis
    • Critical and/or high visibility customers
  • Develop and implement new projects, policies, and procedures for the department(s), and ensure that project goals are met.
  • Develop an annual budget collaboratively with senior management.
  • Utilize previously acquired technical experience to become actively involved in the day-to-day projects to meet schedules and resolve problems.
  • Take responsibility for results, including costs, methods and staffing.
  • Oversee tasks within a large group or department.
  • Understand business decisions and their implications as they relate to operational and financial matters.
  • Leverage expertise to resolve issues diverse in scope through short-term and mid-term planning.
  • Work effectively with all levels of staff including vice presidents throughout the organization.

Leverage leadership skills and NetApp managerial tools to professionally develop subordinates within their roles and careers.

Job Requirements

  • Strong oral and written communication skills are essential.
  • Clear understanding of the product development cycle, technical requirements and project management.
  • Strong understanding of concepts related to computer architecture, data structures and programming practices.
  • Experience in software development.
  • Experience with developing budgets and predicting project costs.
  • Demonstrated ability to manage professional level employees.
  • Experience with microservices architecture and one or more of the following tools:
    • SaltStack
    • Kubernetes
    • Terraform
    • Ansible
    • Containers
    • Docker
    • Puppet
    • Chef
  • Experience with cloud provider platforms: AWS, Azure, or Google Cloud Platform.
  • Experience with REST interfaces.
  • Experience with some of the following areas: high availability (HA), scale-out, security, networking, Linux.
  • Experience with application performance monitoring (APM) and the following tools: New Relic, Datadog.


  • A minimum of 8 years of experience as an individual contributor and 1 to 5 years as a people manager is required.
  • A Bachelor of Science degree in Electrical Engineering or Computer Science, a Master's degree, or a PhD, or equivalent experience, is required.
  • Demonstrated ability to manage multiple projects is required.

Equal Opportunity Employer Minorities/Women/Vets/Disabled.

Nearest Major Market: Waltham
Nearest Secondary Market: Boston

