Latest in AI

Showing: transformer, Students

Tiny hackable CUDA language model implementation
Hacker News (AI keywords) | 9 days ago | New Tool
This GitHub project implements a compact generative pretrained transformer as an autoregressive byte-level sequence model. Its README describes causal self-attention, RoPE, feed-forward layers, AdamW, cross-entropy training, and BLAS/OpenBLAS-backed matrix operations, and lists the CUDA toolkit in its setup steps. It is most useful as an educational and experimental codebase, not as a production-grade replacement for large commercial LLMs.
Implementing a KV Cache from Scratch in nanoVLM ★ 75
Hugging Face Blog | 375 days ago | Tutorial
In the inference process of large language models (LLMs) and vision-language models (VLMs), autoregressive decoding is a major performance bottleneck. Each…
You Too Can Design State-of-the-Art Transformer Positional Encoding: From Intuition to the Mathematical Derivation of RoPE ★ 75
Hugging Face Blog | 566 days ago | Tutorial
This educational article from Hugging Face aims to guide readers, in the most intuitive, step-by-step way, to "reinvent" RoPE (Rotary Position Embedding)…
BERT 101: A Complete Explanation of How the State-of-the-Art NLP Model Works
Hugging Face Blog | 1,565 days ago | Tutorial
BERT (Bidirectional Encoder Representations from Transformers) is a landmark natural language processing (NLP) model proposed by Google in 2018. This Hugging…