Category Archives: Large Language Models
~30% Compression Of LLM (Flan-T5-Base) With Low Rank Decomposition Of Attention Weight Matrices
Colab link to reproduce the experiment: LLM Compression Via Low Rank Decomposition.ipynb. Context: A neural network contains many dense layers that perform matrix multiplication. In the case of Transformers, the Attention module has Key, Query, Value, and Output matrices (along with the … Continue reading
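As a rough illustration of the idea in the excerpt above, the sketch below factors a single dense projection into two low-rank linear layers via truncated SVD. This is a minimal sketch under assumed details, not the code from the linked notebook; the helper name low_rank_factor and the example rank are illustrative assumptions.

```python
# Minimal sketch (not the notebook's exact code): approximate one dense
# Linear layer with two smaller Linear layers using truncated SVD.
import torch
import torch.nn as nn

def low_rank_factor(linear: nn.Linear, r: int) -> nn.Sequential:
    """Replace one (d_out x d_in) Linear with two Linears of rank r."""
    W = linear.weight.data                        # shape: (d_out, d_in)
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)
    # Keep the top-r singular directions: W ≈ (U_r * S_r) @ Vh_r
    A = U[:, :r] * S[:r]                          # (d_out, r)
    B = Vh[:r, :]                                 # (r, d_in)

    first = nn.Linear(W.shape[1], r, bias=False)
    second = nn.Linear(r, W.shape[0], bias=linear.bias is not None)
    first.weight.data = B
    second.weight.data = A
    if linear.bias is not None:
        second.bias.data = linear.bias.data
    return nn.Sequential(first, second)

# Parameter count drops from d_out*d_in to r*(d_out + d_in).
# E.g., for a 768x768 attention projection (Flan-T5-Base's d_model is 768),
# r = 256 gives 256*(768+768) = 393,216 vs 589,824 weights (~33% smaller);
# r here is an illustrative choice, not the notebook's setting.
```

The trade-off is the usual one for low-rank approximation: smaller r means more compression but a larger reconstruction error on the weight matrix, so the rank is typically chosen by measuring downstream task quality after factorization.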
Posted in Large Language Models, llm, machine learning
Tagged large language model, machine learning