
Title:  Site Reliability Engineer (Kafka)

Location:  Bangalore, Karnataka, IN

Requisition ID:  132781

Job Summary

Our TechOps Engineers are the frontline team keeping our large fleet of cloud-hosted Apache Kafka, Cassandra, OpenSearch, Cadence, Valkey, ClickHouse and PostgreSQL clusters up and running. Every day you will diagnose and solve challenging and interesting technical problems. Our service is relied on by some of the leading global names in Banking and Financial Services, Telecom, IoT and Tech, companies that interact with millions of end users.

This role is for an India-based Senior TechOps Engineer focusing primarily on open-source Apache Kafka. It includes operating, maintaining, upgrading and continuously improving the Managed Service for Kafka (across AWS, Azure and GCP) to deliver a great customer experience. It also includes participating in a rotating Level-2 roster.

Skills & Experience

We're looking for smart engineers with exceptional communication skills, a positive attitude, and a passion for IT and learning new things. We expect you to be, or quickly become, proficient in a range of the technologies we use. Successful candidates for this role will:

  • Have strong experience in Kafka, and a desire to learn more and develop to a true expert level.
  • Ideally already have experience diagnosing operational issues by analysing logs and graphs.
  • Preferably have past experience with upgrades and migrations of the technologies mentioned above.
  • Have good experience working with at least one public cloud provider such as AWS, Azure or GCP.
  • Preferably have past IT customer service or support experience.
  • Have good fundamental computer science and software engineering skills and knowledge, particularly in operating system internals, memory management, and networking.
  • Have strong knowledge of and experience with Linux, and be comfortable working from the command line (essential).
  • Have an exceptional ability to communicate clearly and professionally in written and verbal English (essential).
  • Work as part of a team and use your initiative to get things done.
  • Be able to follow required processes and procedures.
  • Investigate and research issues by reviewing the source code.
  • Ideally have programming skills in Python or Java and experience with source code control using Git.
  • Be a proactive, reliable and supportive member of the Technical Operations team for Kafka, and participate in a rotating L2 shift roster.
  • Provide expert operational support to our nodes running in the cloud (AWS, Azure and GCP), using technologies such as Linux (Debian), Docker, and languages including Java, Python and bash.
  • Liaise with our customers’ engineers in resolving interesting issues related to Apache Kafka usage and other supported technologies.
  • Undertake complex cluster operations such as migrations, upgrades and maintenance on our fleet.
  • Develop and continually improve our suite of internal automation tools, applications, and processes.

Education

  • Typically requires a minimum of 4-8 years of related experience with a Bachelor’s degree; or 6 years with a Master’s degree; or 3 years with a PhD; or equivalent experience.


Job Segment: Cloud, Software Engineer, Developer, Java, Linux, Technology, Engineering
