
DeepSeek Launches NSA for Ultra-Fast Long-Context Training and Inference

2025-02-18 16:37:45

ChainCatcher news: according to a Jin10 report, DeepSeek has released NSA (Native Sparse Attention).

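DeepSeek describes NSA as a hardware-aligned, natively trainable sparse attention mechanism for ultra-fast long-context training and inference. With a design optimized for modern hardware, NSA speeds up inference and lowers pre-training costs without compromising performance.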

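According to DeepSeek, NSA matches or outperforms full-attention models on general benchmarks, long-context tasks, and instruction-based reasoning.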
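
For readers unfamiliar with the term, the general idea of sparse attention can be illustrated with a toy sketch. This is not DeepSeek's NSA algorithm, whose details the report does not give. It is a generic top-k variant in plain Python, and the function names (`full_attention`, `topk_sparse_attention`) and the parameter `k` are assumptions made for illustration. Each query attends only to its `k` highest-scoring keys instead of to all of them.

```python
import math


def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def full_attention(q, keys, values):
    # Standard scaled dot-product attention for a single query vector.
    scale = 1.0 / math.sqrt(len(q))
    weights = softmax([dot(q, key) * scale for key in keys])
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]


def topk_sparse_attention(q, keys, values, k):
    # Hypothetical illustration: keep only the k highest-scoring keys,
    # then apply softmax and the weighted sum over that subset.
    scale = 1.0 / math.sqrt(len(q))
    scores = [dot(q, key) * scale for key in keys]
    keep = sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)[:k]
    weights = softmax([scores[i] for i in keep])
    dim = len(values[0])
    return [sum(w * values[i][d] for w, i in zip(weights, keep)) for d in range(dim)]


q = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
values = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
print(full_attention(q, keys, values))
print(topk_sparse_attention(q, keys, values, k=2))
```

Note that this toy version still scores every key before selecting the top `k`, so it saves no computation. It only shows what a sparse attention pattern looks like. Practical sparse attention designs aim to avoid that full scoring pass.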
