Author Archives: Siddharth Sharma

About Siddharth Sharma

Interested in NLP, Retrieval & Ranking Models, Content Understanding and Predictive Analytics.

Real Time Inferencing Of Deep Learning Models

Posted on October 24, 2024 by Siddharth Sharma

Posted in Uncategorized | Leave a comment

AWS Blog In Collaboration With Nvidia – Optimizing Inference For Seq2Seq And Encoder Only Models Using Nvidia GPU And Triton Model Server

Posted on November 22, 2023 by Siddharth Sharma

Blurb: Deep Learning Transformer models are complex in architecture and can have hundreds of millions (or even billions) of parameters, which leads to slow real-time inference. Real-time, low-latency inference of Deep Learning models is a critical requirement … Continue reading →

Posted in Uncategorized | Tagged AWS, BART, GPU, Low Latency, Model Inferencing, Model Server, Nvidia, Sagemaker, SEQ2SEQ, TensorRT, Triton | Leave a comment

~30% Compression Of LLM (Flan-T5-Base) With Low Rank Decomposition Of Attention Weight Matrices

Posted on May 23, 2023 by Siddharth Sharma

Colab Link To Reproduce Experiment: LLM Compression Via Low Rank Decomposition.ipynb

Context

A neural network contains many dense layers which perform matrix multiplication. In the case of Transformers, the Attention module has Key, Query, Value and Output matrices (along with the … Continue reading →

Posted in Large Language Models, llm, machine learning | Tagged large language model, machine learning | Leave a comment

Adapter Based Fine Tuning BART And T5-Flan-XXL For Single Word Spell Correction

Posted on May 11, 2023 by Siddharth Sharma

In this post I share the results of a weekend project on fine-tuning BART and T5 Flan models for sequence-to-sequence generation. I used common misspellings in the English language (single words) to train and evaluate the models. As … Continue reading →

Posted in Uncategorized | Tagged large language model, llm, lora, machine learning, nlp, spell correction | Leave a comment

Revamping Dual Encoder Model Architecture: A layered approach to fuse multi-modal features and plug-and-play integration of Encoders

Posted on April 20, 2023 by Siddharth Sharma

Code examples of feature fusion techniques and tower encoders are in the last half of the blog.

In Embedding Based Retrieval (EBR) we create an embedding of the search query in an online manner and then find the k-nearest neighbors of the query vector in an … Continue reading →

Posted in Uncategorized | Leave a comment

Summary Of Adapter Based Performance Efficient Fine Tuning (PEFT) Techniques For Large Language Models

Posted on April 11, 2023 by Siddharth Sharma

The two most common transfer learning techniques in NLP were feature-based transfer (generating an input text embedding from a pre-trained large model and using it as a feature in your custom model) and fine-tuning (fine-tuning the pre-trained model on custom … Continue reading →

Posted in performance efficient fine tuning, Uncategorized | Tagged adapters, gpt, large language model, llama, lora, machine learning, nlp, peft | Leave a comment

Neural Ranking Architectures

Posted on April 5, 2023 by Siddharth Sharma

Glimpses On Implicit/Explicit, Dense/Sparse, Gated/Non Gated, Low Rank And Many More Layered Interactions 101 Ranking Model Architecture

Neural ranking models are the most important component in a multi-stage retrieval and ranking pipeline. Whether it is e-commerce search, ads targeting, music search or … Continue reading →

Posted in machine learning, ranking models | Tagged machine learning | Leave a comment

Feature Fusion For The Uninitiated

Posted on January 7, 2023 by Siddharth Sharma

Consider a typical e-commerce product. It would have a variety of content-specific features such as product title, brand and thumbnail, and engagement-driven features such as number of clicks and click-through rate. Any machine learning model ingesting features of … Continue reading →

Posted in Uncategorized | 2 Comments

Graph Neural Networks Based Attribute Discovery For E-Commerce Taxonomy Expansion

Posted on December 31, 2022 by Siddharth Sharma

Previous post on Attribute Discovery

In Part 1 of Attribute Discovery we discussed unsupervised approaches that used graph-based keyword and key phrase extraction algorithms to generate a list of candidate tokens that can be potential attributes missing from e-commerce … Continue reading →

Posted in Uncategorized | Leave a comment

Attribute Discovery For E-Commerce Taxonomy Expansion – Part 1 Unsupervised Graph Based Keyword Extraction

Posted on December 30, 2022 by Siddharth Sharma

During my time at Facebook Marketplace I worked on a very esoteric problem: semi-automating attribute discovery, i.e., finding granular attribute values from product titles and descriptions that are not present in the Product Attribute Taxonomy. Each category in … Continue reading →

Posted in Uncategorized | 1 Comment
