SB Crypto Guru News- latest crypto news, NFTs, DEFI, Web3, Metaverse

TEAL Introduces Training-Free Activation Sparsity to Boost LLM Efficiency

by SB Crypto Guru News
September 1, 2024
in Blockchain




Zach Anderson
Sep 01, 2024 08:34

TEAL offers a training-free approach to activation sparsity, significantly enhancing the efficiency of large language models (LLMs) with minimal degradation.




TEAL (Training-Free Activation Sparsity in LLMs) has emerged as an approach to improving the efficiency of large language models (LLMs) without additional training. According to together.ai, the method applies magnitude pruning to hidden states throughout the model, achieving 40-50% activation sparsity with minimal degradation in model quality. Because the weights that would be multiplied by a zeroed activation are never needed, fewer weights have to be transferred to on-chip memory. This addresses the memory-bound nature of LLM inference and translates into 1.53-1.8x wall-clock speedups in single-batch decoding.

Background

LLMs are known for their massive size, which slows inference mainly because parameters must be transferred from device memory to registers. Techniques such as quantization, weight sparsity, and speculative decoding have been developed to tackle this ‘memory wall’. Activation sparsity is a less-explored method: it exploits zero values in hidden states to avoid transferring the weight channels that those zeros would multiply during decoding.
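The idea of skipping weight channels can be illustrated with a toy matrix-vector product. This is a minimal sketch in plain Python, not TEAL's GPU kernel; the function names `dense_matvec` and `sparse_matvec` and the example values are assumptions made up for illustration.

```python
def dense_matvec(W, x):
    """y = W @ x, reading every column of W."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in W]


def sparse_matvec(W, x):
    """y = W @ x, reading only the columns of W whose activation is nonzero."""
    active = [j for j, x_j in enumerate(x) if x_j != 0.0]
    y = [sum(row[j] * x[j] for j in active) for row in W]
    return y, len(active)


# Hypothetical 2x4 weight matrix and a hidden state with two zeroed entries.
W = [[1.0, 2.0, 3.0, 4.0],
     [5.0, 6.0, 7.0, 8.0]]
x = [0.5, 0.0, -2.0, 0.0]

y_sparse, columns_read = sparse_matvec(W, x)
print(y_sparse, columns_read)
```

Both functions return the same result, but the sparse version touches only two of the four weight columns. On real hardware the saving comes from not loading those columns from device memory at all.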

Older models like OPT-175B show high activation sparsity, enabling methods like DejaVu to achieve significant speedups. However, newer models like LLaMA have moved to SwiGLU variants, making it harder to apply such methods. Recent research has attempted to ‘recover’ models that exhibit activation sparsity, but these approaches require extensive retraining on massive datasets.

Motivating Study: Distributional Properties of Activations in LLMs

Research has shown that hidden states in LLMs exhibit outliers and are zero-centered with similar distributional shapes across layers. Specifically, states before MLP and Attention Blocks are Gaussian-shaped, while intermediate states are Laplacian-shaped. This suggests that many low-magnitude activations can be pruned with negligible model degradation, a concept also observed in other studies like CATS.
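Why pruning low-magnitude entries is cheap for zero-centered distributions can be checked numerically. The sketch below uses synthetic Gaussian and Laplacian samples (not real LLM activations) and measures how much of the squared L2 norm disappears when the smallest 40% of entries are zeroed. The helper names, sample size, and seed are assumptions chosen for illustration.

```python
import random


def magnitude_prune(values, sparsity):
    """Zero the `sparsity` fraction of entries with the smallest magnitude."""
    k = int(len(values) * sparsity)
    if k == 0:
        return list(values)
    threshold = sorted(abs(v) for v in values)[k - 1]
    return [0.0 if abs(v) <= threshold else v for v in values]


def energy_removed(values, sparsity):
    """Fraction of the squared L2 norm that pruning removes."""
    pruned = magnitude_prune(values, sparsity)
    total = sum(v * v for v in values)
    kept = sum(v * v for v in pruned)
    return (total - kept) / total


def sample_laplace(rng):
    """Draw from a zero-centered Laplace distribution with unit scale."""
    magnitude = rng.expovariate(1.0)
    return magnitude if rng.random() < 0.5 else -magnitude


rng = random.Random(0)
gaussian = [rng.gauss(0.0, 1.0) for _ in range(20000)]
laplacian = [sample_laplace(rng) for _ in range(20000)]

for name, sample in (("gaussian", gaussian), ("laplacian", laplacian)):
    print(name, round(energy_removed(sample, 0.40), 4))
```

For both shapes, zeroing 40% of the entries removes only a few percent of the total energy, because the pruned entries are exactly the ones that contribute least. The Laplacian sample loses less than the Gaussian one, since more of its mass sits close to zero.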

TEAL

TEAL sparsifies every tensor in the model, achieving near-zero degradation at 25% sparsity and minimal degradation at 40% sparsity. At 50% sparsity, Llama-3 variants show slightly more degradation than the older Llama-2 and Mistral variants. TEAL outperforms CATS by sparsifying every tensor and by choosing to sparsify through the input, which yields lower error.
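One reading of "sparsify through input" is that each weight matrix sees a thresholded version of its own input. The sketch below applies that reading to a toy SwiGLU MLP in plain Python. It is an illustration under stated assumptions, not TEAL's implementation: the thresholds `t_in` and `t_mid`, the weights, and all function names are hypothetical, and the article does not say how thresholds are chosen.

```python
import math


def silu(v):
    """SiLU activation used inside SwiGLU."""
    return v / (1.0 + math.exp(-v))


def matvec(W, x):
    return [sum(w * x_j for w, x_j in zip(row, x)) for row in W]


def sparsify(x, threshold):
    """Zero entries whose magnitude is below a fixed threshold."""
    return [v if abs(v) >= threshold else 0.0 for v in x]


def swiglu_mlp(x, W_gate, W_up, W_down, t_in=0.0, t_mid=0.0):
    """SwiGLU MLP where every matrix receives a magnitude-pruned input."""
    x_s = sparsify(x, t_in)                      # input to W_gate and W_up
    gate = [silu(g) for g in matvec(W_gate, x_s)]
    up = matvec(W_up, x_s)
    hidden = [g * u for g, u in zip(gate, up)]
    return matvec(W_down, sparsify(hidden, t_mid))  # input to W_down


# Hypothetical tiny weights: hidden size 3, intermediate size 4.
W_gate = [[0.2, -0.5, 0.1], [0.7, 0.3, -0.4], [-0.6, 0.1, 0.9], [0.4, 0.8, -0.2]]
W_up = [[0.3, 0.2, -0.7], [-0.1, 0.6, 0.5], [0.8, -0.3, 0.2], [0.1, 0.4, 0.6]]
W_down = [[0.5, -0.2, 0.3, 0.1], [-0.4, 0.6, 0.2, -0.3]]
x = [0.05, -1.2, 0.8]

dense = swiglu_mlp(x, W_gate, W_up, W_down)
sparse = swiglu_mlp(x, W_gate, W_up, W_down, t_in=0.1, t_mid=0.05)
max_error = max(abs(d - s) for d, s in zip(dense, sparse))
print(dense, sparse, max_error)
```

With both thresholds at zero the function reduces to the dense MLP, so the same code path can be used to compare sparse and dense outputs at any sparsity level.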

Hardware-Aware Speed-up

To benchmark real-world speedups, TEAL was integrated with GPT-Fast, achieving significant speedups of up to 1.53x and 1.8x at 40% and 50% sparsity, respectively. While the kernel is faster than cuBLAS at 0% sparsity, there is still room for further optimization.
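The reported figures can be put next to a simple upper bound. If decoding time were spent purely on reading weights, skipping a fraction s of them would give a speedup of 1 / (1 - s). That model is an assumption introduced here for comparison; it ignores activations, the KV cache, and kernel overhead, so it is only a rough ceiling.

```python
def ideal_speedup(sparsity):
    """Speedup if runtime were proportional to the weights actually read."""
    return 1.0 / (1.0 - sparsity)


# Speedups reported in the article for single-batch decoding.
reported = {0.40: 1.53, 0.50: 1.8}

for sparsity, measured in reported.items():
    bound = ideal_speedup(sparsity)
    print(f"{sparsity:.0%} sparsity: bound {bound:.2f}x, "
          f"reported {measured}x, {measured / bound:.0%} of bound")
# 40% sparsity: bound 1.67x, reported 1.53x, 92% of bound
# 50% sparsity: bound 2.00x, reported 1.8x, 90% of bound
```

Under this simplified model, the reported speedups reach roughly 90% of the ceiling at both sparsity levels.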

Compatibility with Quantization

TEAL is also compatible with quantization, another technique for efficient LLM inference. Combining activation sparsity with quantization unlocks new regimes for moving data from memory to GPU registers, allowing for higher inference speedups.
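A back-of-the-envelope estimate shows why the two techniques compound: quantization shrinks each weight, while activation sparsity reduces how many weights are read per token. The sketch below assumes a hypothetical model with 8 billion weights in its linear layers and ignores quantization scales and any other metadata, so the numbers are illustrative only.

```python
def weight_bytes_read(num_weights, bits_per_weight, activation_sparsity):
    """Approximate weight bytes read per decoded token."""
    dense_bytes = num_weights * bits_per_weight / 8
    return dense_bytes * (1.0 - activation_sparsity)


NUM_WEIGHTS = 8_000_000_000  # hypothetical model size, not from the article

for bits in (16, 8, 4):
    for sparsity in (0.0, 0.4, 0.5):
        gigabytes = weight_bytes_read(NUM_WEIGHTS, bits, sparsity) / 1e9
        print(f"{bits:>2}-bit weights, {sparsity:.0%} sparsity: {gigabytes:.1f} GB per token")
```

In this toy model, 4-bit weights at 50% activation sparsity require one eighth of the weight traffic of dense 16-bit inference.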

Applications

TEAL’s most immediate application is accelerating inference in resource-constrained edge settings, particularly in single-batch scenarios. It also aids inference providers like Together AI, which hosts over 100 open-source models across a large fleet of GPUs, by serving models more efficiently.



Copyright © 2022 - SB Crypto Guru News.
SB Crypto Guru News is not responsible for the content of external sites.
