Hugging Face Blog · Jan 4, 2024, 12:00 AM

Welcome aMUSEd: An Efficient, Lightweight Text-to-Image Model

Original: Welcome aMUSEd: Efficient Text-to-Image Generation

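Hugging Face has released aMUSEd, an open-source text-to-image model based on Google's MUSE architecture. Unlike mainstream diffusion models, aMUSEd uses masked image modeling (MIM) and can generate an image in as few as 12 steps. With only about 800 million parameters, it is well suited to fast inference and fine-tuning on consumer hardware, and it also supports image-to-image generation and inpainting.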

The official Hugging Face blog introduced aMUSEd, a new open-source text-to-image model. It is a reproduction and optimization of the MUSE architecture previously proposed by Google. Unlike mainstream diffusion models such as Stable Diffusion, aMUSEd uses Masked Image Modeling (MIM). Because this architecture is non-autoregressive, it does not need the dozens of denoising steps that diffusion models typically run; it can predict complete image features in as few as 12 inference steps, which makes generation very fast.
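To make the fixed-step idea concrete, here is a minimal toy sketch of masked parallel decoding in Python. It is not the aMUSEd implementation. It assumes a MaskGIT-style loop with a cosine masking schedule and confidence-based token selection, which the summary above does not spell out. The `toy_predict` function is a hypothetical stand-in for the real transformer, and the token count (256) and vocabulary size (8192) are illustrative values only.

```python
import math
import random

MASK = -1  # hypothetical sentinel marking a position that is still masked


def cosine_mask_ratio(step, num_steps):
    """Fraction of tokens that should remain masked after `step` steps (assumed cosine schedule)."""
    return math.cos(math.pi / 2 * step / num_steps)


def toy_predict(tokens, vocab_size, rng):
    """Stand-in for the model: returns a random (token_id, confidence) guess per position."""
    return [(rng.randrange(vocab_size), rng.random()) for _ in tokens]


def masked_decode(num_tokens=256, vocab_size=8192, num_steps=12, seed=0):
    """Start fully masked, then commit the most confident predictions at each step."""
    rng = random.Random(seed)
    tokens = [MASK] * num_tokens
    masked_after_step = []
    for step in range(1, num_steps + 1):
        # All positions are predicted in parallel on every step.
        preds = toy_predict(tokens, vocab_size, rng)
        # Number of positions the schedule leaves masked after this step.
        keep_masked = math.floor(cosine_mask_ratio(step, num_steps) * num_tokens)
        masked_positions = [i for i, t in enumerate(tokens) if t == MASK]
        # Most confident predictions first.
        masked_positions.sort(key=lambda i: preds[i][1], reverse=True)
        num_to_commit = len(masked_positions) - keep_masked
        for i in masked_positions[:num_to_commit]:
            tokens[i] = preds[i][0]
        masked_after_step.append(keep_masked)
    return tokens, masked_after_step


if __name__ == "__main__":
    tokens, schedule = masked_decode()
    print("masked tokens remaining after each step:", schedule)
```

In this sketch every position is filled after exactly `num_steps` passes, because the schedule reaches zero on the last step. That is the property behind the fixed 12-step budget: the number of model calls is set by the schedule, not by the number of image tokens.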

Summaries are AI-generated; the original article is authoritative.