Title: Site Reliability Engineer
IE
Job Summary
NetApp is looking for a Senior TechOps Engineer to join our growing Instaclustr team in Europe. NetApp’s Instaclustr offering provides open source as a service, delivering reliability at scale. We manage cutting-edge open-source technologies (Cassandra, Kafka, PostgreSQL, Redis/Valkey, OpenSearch, ClickHouse and Cadence) for our customers around the world. NetApp Instaclustr makes it easy for our customers to run powerful open-source applications at the highest levels of scale. We have developed a platform that takes care of the whole lifecycle: provisioning infrastructure, installing applications, and, most importantly, keeping the applications running reliably in production. Since being founded in 2013, Instaclustr has grown strongly, with over 300 customers worldwide and over 25,000 nodes under management.
Our Technical Operations Engineers are the frontline team keeping our large fleet of cloud-hosted open-source clusters up and running. Your work will ensure the security, reliability and performance of world-class systems and databases. You will collaborate with our customers’ technical teams, from globally recognised companies in the gaming, banking and logistics sectors, ranging from big multinationals to emerging start-ups.
The Role
If you have excellent operational knowledge of managing Apache Cassandra clusters, look no further!
As a Senior TechOps Engineer (Cassandra), you will be part of the frontline team responsible for the reliability, availability, and maintenance of our large fleet of cloud-hosted Cassandra clusters. Every day, you will diagnose and solve interesting technical problems, providing Cassandra as a managed service in a highly automated environment. Our service is relied on by some of the leading global names in banking and financial services, telecom, IoT, and tech, companies that interact with millions of end users.
Job Requirements
We're looking for smart engineers with exceptional communication skills, a positive attitude, and a passion for IT and learning new things. We expect you to be, or quickly become, proficient in a range of the technologies we use. Successful candidates for this role will:
- Have strong experience in Cassandra administration and architecture, and a desire to learn more and develop to a true expert level.
- Have experience diagnosing and recommending mitigation strategies for a range of Cassandra-related issues, including performance degradation due to resource bottlenecks, suboptimal data modelling leading to hot partitions, excessive tombstones, and inefficiencies caused by range slices and poorly constructed queries.
- Have hands-on experience with Cassandra architecture and core administrative tasks, including compactions, repairs, backup and recovery, resolving schema disagreements, and managing configurations.
- Have experience handling Cassandra maintenance activities such as upgrades and migrations.
- Strong knowledge and experience with Linux, and comfortable working from the command line (essential)
- Exceptional ability to communicate clearly and professionally in written and verbal English (essential).
- Preferably have past IT customer service/support experience in an ITIL-based setup.
- Good fundamental computer science/software engineering skills and knowledge, particularly operating system internals, memory management, and networking.
- Ability to follow required processes and procedures.
- Work as part of a team and use your initiative to get things done.
- Preference will be given to candidates with the ability to investigate/research Cassandra issues by reviewing the Apache Cassandra codebase.
- Knowledge of public cloud platforms such as AWS, and of tools like Docker and Ansible, will be a great addition.
- Programming skills in Python or Java, and source code control using Git, would be a plus.
- You must be a resident of Ireland with the right to work there.
I'm interested. What else will I be doing?
- Provide expert operational support to our nodes running in the cloud (AWS, Azure and GCP), using technologies such as Linux (Debian), Docker, and languages including Java, Python and bash
- Participate in the on-call Level 1 or Level 2 roster
- Liaise with our customers’ engineers in resolving interesting issues related to Cassandra usage and other supported technologies
- Undertake complex cluster operations such as migrations, upgrades and maintenance on our fleet
- Develop and continually improve our suite of internal automation tools, applications, and processes
Compensation:
The salary offered will be determined by the candidate's location, qualifications, experience, and education and may be outside of this range. Final compensation packages are competitive and in line with industry standards, reflecting a variety of factors, and include a comprehensive benefits package. This may cover health insurance, life insurance, retirement or pension plans, paid time off, various leave options, performance-based incentives, an employee stock purchase plan, and/or restricted stock units (RSUs), with all offerings subject to regional variations and governed by local laws, regulations, and company policies. Benefits may vary by country and region, and further details will be provided as part of the recruitment process.
Job Segment:
Open Source, Cloud, Computer Science, Developer, Linux, Technology