    Artificial Intelligence

    2-bit VPTQ: 6.5x Smaller LLMs while Preserving 95% Accuracy

    By Editor Times Featured, January 31, 2025


    Very accurate 2-bit quantization for running 70B LLMs on a 24 GB GPU

    Towards Data Science

    Generated with ChatGPT

    Recent developments in low-bit quantization for LLMs, like AQLM and AutoRound, are now showing acceptable levels of degradation in downstream tasks, especially for large models. That said, 2-bit quantization still introduces noticeable accuracy loss in most cases.

    One promising algorithm for low-bit quantization is VPTQ (MIT license), proposed by Microsoft. It was introduced in October 2024 and has since shown excellent performance and efficiency in quantizing large models.

    In this article, we will:

    1. Review the VPTQ quantization algorithm.
    2. Demonstrate how to use VPTQ models, many of which are already available. For instance, we can easily find low-bit variants of Llama 3.3 70B, Llama 3.1 405B, and Qwen2.5 72B.
    3. Evaluate these models and discuss the results to understand when VPTQ models can be a good choice for LLMs in production.

    Remarkably, 2-bit quantization with VPTQ almost achieves performance comparable to the original 16-bit model on tasks such as MMLU. Moreover, it enables running Llama 3.1 405B on a single GPU, while using less memory than a 70B model!
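
    The memory claims above can be sanity-checked with simple arithmetic. The sketch below is only a rough estimate: it counts weight storage alone and assumes every parameter is stored at the stated bit width. The function name `weight_memory_gb` is hypothetical, and the estimate ignores codebooks, layers left unquantized, activations, and the KV cache, so real footprints will be somewhat larger. That overhead is one plausible reason the 6.5x figure in the title is below the ideal 8x.

```python
def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (10^9 bytes), ignoring all overhead."""
    return num_params * bits_per_weight / 8 / 1e9


# Nominal parameter counts (assumed round numbers, not exact model sizes).
models = {"70B": 70e9, "405B": 405e9}

for name, num_params in models.items():
    fp16_gb = weight_memory_gb(num_params, 16)
    q2_gb = weight_memory_gb(num_params, 2)
    print(f"{name}: 16-bit ~{fp16_gb:.1f} GB, 2-bit ~{q2_gb:.1f} GB")

# Ideal compression ratio when going from 16 bits to 2 bits per weight.
ideal_ratio = 16 / 2
print(f"Ideal compression ratio: {ideal_ratio:.1f}x")
```

    Under these assumptions, a 70B model at 2 bits needs about 17.5 GB for its weights, which is consistent with fitting on a 24 GB GPU, and a 405B model at 2 bits needs about 101 GB, less than the roughly 140 GB of a 70B model stored in 16-bit (presumably the comparison intended above).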



    Copyright © 2024 Timesfeatured.com IP Limited. All Rights.