Monthly Archives: August 2020
QUS: Query Understanding Service
Introduction: The journey of a search query through an e-commerce engineering stack can be broadly divided into the following phases: the search query text processing phase; the retrieval phase, where relevant products are fetched from the indexer; and, last but not least, product …