smashinggradient
Just another WordPress.com site

Monthly Archives: October 2024

Real Time Inferencing Of Deep Learning Models

Posted on October 24, 2024 by Siddharth Sharma
Posted in Uncategorized