Title: Data and Applied Scientist
Bangalore, Karnataka, IN
Job Summary
NetApp is seeking a Data & Applied Scientist to join the Data Services organization. The overarching vision of this organization is to empower organizations to effectively manage and govern their data estate and build cyber-resiliency while accelerating their digital transformation journey. To achieve this vision, we are taking an AI-first approach to building and delivering a world-class suite of data services. As an individual contributor on this team, the Data & Applied Scientist will be responsible for independently developing and shipping ML/AI-based solutions to solve real-world challenges in the domains of data governance and compliance. The candidate will possess deep expertise in using modern AI/ML systems to ship impactful products to production. This will be a challenging and fun role in one of the most exciting areas of the industry today.
Job Requirements
- Lead the development and deployment of AI/ML models and systems for data governance, using techniques from classical machine learning, generative AI, and AI agents.
- Develop scalable data pipelines for various AI/ML-driven solutions: build curated data pipelines, set up automated evals, and adopt the latest techniques, approaches, and platforms for rapid experimentation.
- Collaborate with data scientists and engineers to integrate AI models and systems into the broader products at NetApp. Effectively communicate complex technical artifacts to both technical and non-technical audiences.
- Work with a great deal of autonomy and proactively bring open-source AI innovations into our research and experimentation roadmap.
- Ensure scalability, reliability, and performance of AI models in production environments.
- Have a customer-focused mindset and build AI/ML products that delight our customers.
- Represent NetApp as an innovator in the machine learning community and promote the company's product capabilities at industry and academic conferences.
Required and Preferred Qualifications
Required Qualifications
• Master’s or bachelor’s degree in computer science, applied mathematics, statistics, or data science, OR equivalent skills. A degree is not compulsory.
• Expertise in data science fundamentals and model evaluation, as evidenced by a solid understanding of supervised ML, unsupervised ML, and deep learning approaches, with 4+ years of experience shipping them to production.
• Strong proficiency in Python, modern ML frameworks (PyTorch, transformers), and derivative tooling.
• Working expertise with large language models, prompting techniques, fine-tuning, and synthetic data generation, and knowing when not to use LLMs.
• Excellent communication and collaboration skills, with a demonstrated ability to work effectively with cross-functional teams and stakeholders across an organization.
Preferred Qualifications
• Curiosity about, and hands-on tinkering with, AI agents or multi-agent systems.
• Understanding of data governance, security policies and compliance frameworks.
• Experience representing your work or company at AI/ML conferences. Publications or contributions to the AI/ML community related to NLP or data governance.
• Active GitHub profile showcasing relevant open-source AI/ML projects or Kaggle achievements.
Job Segment:
Open Source, Computer Science, Database, Scientific, Technology, Research, Engineering