Hugging Face Blog, Dec 19, 2024, 12:00 AM

A Successor to BERT at Last: ModernBERT Officially Released

Original: Finally, a Replacement for BERT: Introducing ModernBERT

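Hugging Face, LightOn, and other teams have jointly released ModernBERT, which aims to replace BERT, a model that is six years old yet still widely used. ModernBERT adopts a modernized architecture, extends the context length from 512 to 8,192 tokens, and natively supports FlashAttention-2 and RoPE. While keeping inference very fast and memory usage low, it outperforms DeBERTa-v3 across retrieval, classification, and embedding tasks, giving RAG and search systems a new boost.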

Despite the recent dominance of generative decoder models (such as GPT and Llama), encoder-only models (such as BERT) remain indispensable behind the scenes for tasks like search, information retrieval (including retrieval-augmented generation, RAG), text classification, and named entity recognition (NER). However, the classic BERT architecture has grown outdated since its 2018 release: it supports only a 512-token context window and lacks support for modern hardware optimization techniques.
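To make the context-window limit concrete, here is a minimal sketch of how a long document has to be split before an encoder can process it. It is not taken from the original article: `split_into_windows`, the 6,000-token document, and the integer token IDs are all hypothetical, and the sketch ignores the few positions a real tokenizer reserves for special tokens.

```python
def split_into_windows(tokens, window, overlap=0):
    """Split a token sequence into consecutive windows of at most `window` tokens.

    Hypothetical helper for illustration only. `overlap` is the number of
    tokens shared between neighbouring windows.
    """
    if window <= 0 or not 0 <= overlap < window:
        raise ValueError("need window > 0 and 0 <= overlap < window")
    stride = window - overlap
    chunks = []
    start = 0
    while start < len(tokens):
        chunks.append(tokens[start:start + window])
        if start + window >= len(tokens):
            break  # this window already reaches the end of the document
        start += stride
    return chunks


# Assumed example: a 6,000-token document represented by placeholder token IDs.
document = list(range(6000))

print(len(split_into_windows(document, 512)))   # 12 windows at a 512-token limit
print(len(split_into_windows(document, 8192)))  # 1 window at an 8,192-token limit
```

Under these assumptions, a 512-token encoder sees the document as 12 separate pieces, each embedded or classified without the others, while an 8,192-token encoder can take it in a single pass.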

Summaries are AI-generated; the original article is authoritative.