smallstep.ai
Founder (Dec 2023 - Present)
Building Marathi LLM:
- Pretrained and fine-tuned Marathi Llama (Misal) models at 7B and 1B parameters.
- End-to-end development, evaluation, and deployment of Marathi Llama.
- Open-sourced the instruction-tuned model here
- Published a technical blog post covering the entire procedure here
- Launch : LinkedIn Post, Twitter Post
- Coverage : YourStory, Moneycontrol, ARB Podcast
- Open Source : Hugging Face Space
Medpiper
AI Consultant (Jan 2024 - Jun 2024)
Medical Documents Digitization:
- Led the creation of an AI platform dedicated to the extraction and digitization of health records.
- Built robust solutions to efficiently handle medical documents in diverse formats across vendors.
- Developed a high-performance document extraction system, processing batches with sub-second latency.
- Optimized systems to reduce processing time from 10 minutes to 1 minute, achieving a 10x performance boost.
Tekion
Data Scientist (Aug 2023 - Jul 2024)
Document AI - Table Extraction:
- Developed a robust solution for extracting rows and columns from tables in images
- Trained a Mask R-CNN-based model to identify the structure of table components
- Reconstructed table components for downstream consumption
Automobile Service Recommendation:
- Conducted comprehensive EDA on seasonal trends in services taken
- Discovered key associations between different services taken
- Generated recommendations for individual vehicles based on their unique service histories.
- Impact : Significant increase in service add-to-cart and service-taken rates
Pratilipi
Data Scientist (Mar 2021 - Aug 2022)
Modelling user item interactions using autoencoders:
- Leveraged bottleneck vectors as embeddings
- Created a querying model to fetch similar interactions
- Built a collaborative filtering model
- Impact : Observed a high impact on reads and improvements in monetization
Creating hooks to increase interaction at the end of each piece of content:
- Leveraged author similarity and common user interactions
- Generated relevant author embeddings to capture category information
- Impact : 2x growth in reads after a user completed a read
Next authors to follow model:
- Applied clustering and a probabilistic approach to compute “follow author” recommendations
- Impact : 20% increase in follow action from author profile page
Data Science Intern (Dec 2020 - Feb 2021)
Category personalisation “For you” section:
- Captured category interests and subcategory interests of users
- Personalised various category combinations
- Impact : Observed multi-front impact on top line, monetisation, and author follows
Conducted multiple experiments in the “For you” section of the app:
- Tested multiple hypotheses to increase reads
- Conducted analysis to gain insights into user behaviour across multiple geographies