Tag Archives: machine learning

~30% Compression Of LLM (Flan-T5-Base) With Low Rank Decomposition Of Attention Weight Matrices

Colab Link To Reproduce Experiment: LLM Compression Via Low Rank Decomposition.ipynb

Context: A neural network contains many dense layers which perform matrix multiplication. In the case of Transformers, the Attention module has Key, Query, Value and Output matrices (along with the …

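A minimal sketch of the core idea, assuming a PyTorch nn.Linear projection and truncated SVD; the rank, the choice of layers to factorize, and the helper name low_rank_factorize are illustrative, not the post's exact code:

```python
# Illustrative sketch: compressing a dense attention projection with truncated SVD.
import torch
import torch.nn as nn

def low_rank_factorize(linear: nn.Linear, rank: int) -> nn.Sequential:
    """Replace W (out x in) with two smaller factors: up (out x r) @ down (r x in)."""
    W = linear.weight.data                      # shape: (out_features, in_features)
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)
    U_r = U[:, :rank] * S[:rank]                # fold singular values into U
    V_r = Vh[:rank, :]
    down = nn.Linear(linear.in_features, rank, bias=False)
    up = nn.Linear(rank, linear.out_features, bias=linear.bias is not None)
    down.weight.data = V_r
    up.weight.data = U_r
    if linear.bias is not None:
        up.bias.data = linear.bias.data
    return nn.Sequential(down, up)

# Example: a 768x768 projection factorized at rank 256 stores
# 2 * 768 * 256 ≈ 393K weights instead of 768 * 768 ≈ 590K (roughly a third fewer).
proj = nn.Linear(768, 768, bias=False)
compressed = low_rank_factorize(proj, rank=256)
x = torch.randn(4, 768)
print(torch.dist(proj(x), compressed(x)))       # reconstruction error at this rank
```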

Adapter Based Fine Tuning Of BART And T5-Flan-XXL For Single Word Spell Correction

In this post I share the results of a weekend project on fine-tuning BART and T5 Flan models for sequence-to-sequence generation. I used common misspellings in the English language (single words) for training and evaluating the models. As …

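As one hedged illustration of the setup, the sketch below applies LoRA adapters from the peft library to a seq2seq model; the post's exact adapter type, base checkpoint, target modules, and hyperparameters are not shown here and may differ:

```python
# Minimal adapter-style fine-tuning sketch for single-word spell correction.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
from peft import LoraConfig, TaskType, get_peft_model

model_name = "facebook/bart-base"               # stand-in checkpoint for illustration
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

lora = LoraConfig(task_type=TaskType.SEQ_2_SEQ_LM, r=8, lora_alpha=32,
                  lora_dropout=0.1, target_modules=["q_proj", "v_proj"])
model = get_peft_model(model, lora)
model.print_trainable_parameters()              # only the small adapter weights train

# Each training pair maps a misspelling to its correction, e.g. "recieve" -> "receive".
inputs = tokenizer("recieve", return_tensors="pt")
labels = tokenizer("receive", return_tensors="pt").input_ids
loss = model(**inputs, labels=labels).loss
loss.backward()
```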

Summary Of Adapter Based Parameter Efficient Fine Tuning (PEFT) Techniques For Large Language Models

The two most common transfer learning techniques in NLP were feature-based transfer (generating input text embeddings from a large pre-trained model and using them as features in your custom model) and fine-tuning (fine-tuning the pre-trained model on custom …

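For orientation, here is a sketch of the bottleneck adapter idea (in the spirit of Houlsby-style adapters) that adapter-based PEFT methods build on: a small down-project / nonlinearity / up-project block with a residual connection, inserted into an otherwise frozen transformer layer. The dimensions and placement are illustrative assumptions, not the post's exact setup:

```python
import torch
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    def __init__(self, hidden_dim: int, bottleneck_dim: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_dim, bottleneck_dim)
        self.up = nn.Linear(bottleneck_dim, hidden_dim)
        self.act = nn.GELU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Residual connection keeps the frozen layer's output intact at initialization.
        return x + self.up(self.act(self.down(x)))

adapter = BottleneckAdapter(hidden_dim=768)
hidden_states = torch.randn(2, 16, 768)          # (batch, seq_len, hidden)
print(adapter(hidden_states).shape)              # torch.Size([2, 16, 768])
```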

Neural Ranking Architectures

Glimpses On Implicit/Explicit, Dense/Sparse, Gated/Non-Gated, Low-Rank And Many More Layered Interactions

101 Ranking Model Architecture

Neural ranking models are the most important component in a multi-stage retrieval and ranking pipeline. Whether it is e-commerce search, ads targeting, music search or …

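To anchor the terminology, here is a minimal two-tower scoring sketch: each tower embeds one side (query / item), and an MLP over the concatenated representations models their explicit interaction. The class name, feature dimensions, and layer sizes are illustrative assumptions, not an architecture from the post:

```python
import torch
import torch.nn as nn

class TwoTowerRanker(nn.Module):
    def __init__(self, query_dim: int, item_dim: int, hidden: int = 128):
        super().__init__()
        self.query_tower = nn.Sequential(nn.Linear(query_dim, hidden), nn.ReLU())
        self.item_tower = nn.Sequential(nn.Linear(item_dim, hidden), nn.ReLU())
        # Explicit interaction layer over the concatenated tower outputs.
        self.scorer = nn.Sequential(nn.Linear(2 * hidden, hidden), nn.ReLU(),
                                    nn.Linear(hidden, 1))

    def forward(self, query_feats: torch.Tensor, item_feats: torch.Tensor) -> torch.Tensor:
        q = self.query_tower(query_feats)
        i = self.item_tower(item_feats)
        return self.scorer(torch.cat([q, i], dim=-1)).squeeze(-1)   # relevance score per pair

model = TwoTowerRanker(query_dim=32, item_dim=48)
scores = model(torch.randn(8, 32), torch.randn(8, 48))
print(scores.shape)                               # torch.Size([8])
```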