r/LocalLLaMA top day · Jun 7, 2026, 10:07 PM · /u/xspider2000

club-3090 Adds Experimental FP8 Support for Qwen3.6-27B

Original: club-3090 adds experimental FP8 support for Qwen3.6-27B!

club-3090 introduces experimental FP8 support for Qwen3.6-27B, optimized for dual RTX 3090 setups.

The open-source project club-3090 has rolled out experimental FP8 quantization support for Qwen3.6-27B. Dual RTX 3090 users have been waiting for this update: FP8 stores each weight in one byte instead of the two bytes BF16 uses, which roughly halves the VRAM needed for the model weights. According to reports, the official Qwen3.6-27B-FP8 model performs virtually identically to the original unquantized BF16 version.

For enthusiasts and developers who run large language models locally, dual RTX 3090s (48 GB of VRAM in total) have long been regarded as a "golden configuration" with very high cost-performance. The open-source project **club-3090**, which is optimized specifically for this configuration, recently released an update: it now provides experimental **FP8 quantization** support for the **Qwen3.6-27B** model.
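
To make the VRAM argument concrete, here is a rough back-of-the-envelope sketch of weight memory at each precision. It assumes a nominal 27 billion parameters (taken from the model name, not from a published spec) and counts weights only; the KV cache, activations, and framework overhead are ignored, and real FP8 checkpoints may keep some layers in higher precision. The helper name `weight_memory_gb` is made up for illustration and is not part of club-3090.

```python
def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Approximate memory for model weights only, in GB (10^9 bytes)."""
    return num_params * bytes_per_param / 1e9


# Assumption: nominal parameter count inferred from the "27B" in the model name.
NUM_PARAMS = 27e9

# Two RTX 3090 cards at 24 GB each, as stated in the article.
TOTAL_VRAM_GB = 48.0

# Storage cost per weight: BF16 uses 2 bytes, FP8 uses 1 byte.
BYTES_PER_PARAM = {"BF16": 2.0, "FP8": 1.0}

estimates = {
    name: weight_memory_gb(NUM_PARAMS, nbytes)
    for name, nbytes in BYTES_PER_PARAM.items()
}

for name, gb in estimates.items():
    headroom = TOTAL_VRAM_GB - gb  # negative means the weights alone do not fit
    print(f"{name}: ~{gb:.0f} GB of weights, headroom {headroom:+.0f} GB")
```

Under these assumptions the BF16 weights alone (about 54 GB) exceed the 48 GB available, while the FP8 weights (about 27 GB) leave room for the KV cache and runtime overhead.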

#qwen #vllm #fp8 #quantization #multi-gpu #local-llm #rtx-3090
