Close Menu
    Facebook LinkedIn YouTube WhatsApp X (Twitter) Pinterest
    Trending
    • June deadline approaches for Hawthorne sale process
    • Today’s NYT Mini Crossword Answers for June 4
    • New tiny nudibranch species discovered in Taiwan
    • Why the Budget’s CGT changes are a disaster for angel investors and startups
    • OpenAI and Anthropic Sign Letter to Prevent AI-Developed Biological Weapons
    • New York sports betting statements bill advances
    • SwitchBot Launches the Most Complete Home Weather Station I’ve Seen
    • What It Takes for Future-Ready Power Distribution
    Facebook LinkedIn WhatsApp
    Times FeaturedTimes Featured
    Thursday, June 4
    • Home
    • Founders
    • Startups
    • Technology
    • Profiles
    • Entrepreneurs
    • Leaders
    • Students
    • VC Funds
    • More
      • AI
      • Robotics
      • Industries
      • Global
    Times FeaturedTimes Featured
    Home»Artificial Intelligence»2-bit VPTQ: 6.5x Smaller LLMs while Preserving 95% Accuracy
    Artificial Intelligence

    2-bit VPTQ: 6.5x Smaller LLMs while Preserving 95% Accuracy

    Editor Times FeaturedBy Editor Times FeaturedJanuary 31, 2025No Comments1 Min Read
    Facebook Twitter Pinterest Telegram LinkedIn Tumblr WhatsApp Email
    Share
    Facebook Twitter LinkedIn Pinterest Telegram Email WhatsApp Copy Link


    Very correct 2-bit quantization for operating 70B LLMs on a 24 GB GPU

    Towards Data Science

    Generated with ChatGPT

    Latest developments in low-bit quantization for LLMs, like AQLM and AutoRound, at the moment are displaying acceptable ranges of degradation in downstream duties, particularly for big fashions. That stated, 2-bit quantization nonetheless introduces noticeable accuracy loss typically.

    One promising algorithm for low-bit quantization is VPTQ (MIT license), proposed by Microsoft. It was launched in October 2024 and has since proven glorious efficiency and effectivity in quantizing massive fashions.

    On this article, we are going to:

    1. Evaluation the VPTQ quantization algorithm.
    2. Reveal how one can use VPTQ fashions, a lot of that are already obtainable. As an illustration, we will simply discover low-bit variants of Llama 3.3 70B, Llama 3.1 405B, and Qwen2.5 72B.
    3. Consider these fashions and focus on the outcomes to know when VPTQ fashions could be a good selection for LLMs in manufacturing.

    Remarkably, 2-bit quantization with VPTQ virtually achieves efficiency corresponding to the unique 16-bit mannequin on duties akin to MMLU. Furthermore, it allows operating Llama 3.1 405B on a single GPU, whereas utilizing much less reminiscence than a 70B mannequin!



    Source link

    Share. Facebook Twitter Pinterest LinkedIn Tumblr Email
    Editor Times Featured
    • Website

    Related Posts

    I Built a C++ Backend So My GPU Would Stop Eating Air

    June 3, 2026

    I Spent May Evaluating Different Engines for OCR

    June 3, 2026

    Why AI Is NOT Stealing Your Job

    June 3, 2026

    What AI Agents Should Never Do on Their Own

    June 3, 2026

    Exploring Income Patterns with Python Pandas, Matplotlib, and Seaborn

    June 2, 2026

    From Local App to Public Website in Minutes

    June 2, 2026

    Comments are closed.

    Editors Picks

    June deadline approaches for Hawthorne sale process

    June 4, 2026

    Today’s NYT Mini Crossword Answers for June 4

    June 4, 2026

    New tiny nudibranch species discovered in Taiwan

    June 4, 2026

    Why the Budget’s CGT changes are a disaster for angel investors and startups

    June 4, 2026
    Categories
    • Founders
    • Startups
    • Technology
    • Profiles
    • Entrepreneurs
    • Leaders
    • Students
    • VC Funds
    About Us
    About Us

    Welcome to Times Featured, an AI-driven entrepreneurship growth engine that is transforming the future of work, bridging the digital divide and encouraging younger community inclusion in the 4th Industrial Revolution, and nurturing new market leaders.

    Empowering the growth of profiles, leaders, entrepreneurs businesses, and startups on international landscape.

    Asia-Middle East-Europe-North America-Australia-Africa

    Facebook LinkedIn WhatsApp
    Featured Picks

    advanced tech for adventure touring

    October 8, 2025

    Tech Life – Dealing with cyber attacks

    October 7, 2025

    Google’s Will Smith double is better at eating AI spaghetti … but it’s crunchy?

    May 24, 2025
    Categories
    • Founders
    • Startups
    • Technology
    • Profiles
    • Entrepreneurs
    • Leaders
    • Students
    • VC Funds
    Copyright © 2024 Timesfeatured.com IP Limited. All Rights.
    • Privacy Policy
    • Disclaimer
    • Terms and Conditions
    • About us
    • Contact us

    Type above and press Enter to search. Press Esc to cancel.