
DeepSeek launches NSA for ultra-fast long-context training and inference

2025-02-18 16:37:45

ChainCatcher news: according to Jin10, DeepSeek has launched NSA (Native Sparse Attention).

DeepSeek describes NSA as a hardware-aligned, natively trainable sparse attention mechanism designed for ultra-fast long-context training and inference. According to the company, a design optimized for modern hardware lets NSA speed up inference and reduce pre-training costs without compromising performance.

DeepSeek reports that on general benchmarks, long-context tasks, and instruction-based reasoning, NSA performs on par with or better than full-attention models.
