Qwen3.5 27B: Dense Vision-Language Reasoning Model
Qwen3.5 27B is a dense multimodal foundation model from Alibaba's Qwen team, built on a hybrid Gated DeltaNet and Gated Attention architecture. With 27 billion parameters, it pairs strong text reasoning with native vision understanding through early fusion multimodal training, delivering competitive benchmark performance against much larger models while remaining practical to serve on single-node hardware.
Key Features
- Unified Vision-Language Foundation - Early fusion training on multimodal tokens achieves cross-generational parity with Qwen3 and outperforms Qwen3-VL models across reasoning, coding, agents, and visual understanding benchmarks
- Efficient Hybrid Architecture - Gated Delta Networks combined with Gated Attention deliver high-throughput inference with minimal latency overhead
- Scalable RL Generalization - Reinforcement learning scaled across million-agent environments with progressively complex task distributions for robust real-world adaptability
- Global Linguistic Coverage - Expanded support to 201 languages and dialects for inclusive worldwide deployment
- Long Context - 262,144 tokens natively, extensible up to 1,010,000 tokens with YaRN
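The YaRN extension mentioned above is usually enabled through a `rope_scaling` entry in the model configuration. A minimal sketch of the arithmetic involved — the exact key names are an assumption based on how other recent Qwen model cards document YaRN, not confirmed for this model:

```python
# Hedged sketch: compute the YaRN scaling factor needed to stretch the
# native 262,144-token window toward ~1,010,000 tokens. The rope_scaling
# key names below follow the convention of earlier Qwen releases (assumption).
NATIVE_CTX = 262_144
TARGET_CTX = 1_010_000

factor = TARGET_CTX / NATIVE_CTX  # ~3.85; deployments typically round up

rope_scaling = {
    "rope_type": "yarn",
    "factor": 4.0,  # rounded-up scaling factor (assumption)
    "original_max_position_embeddings": NATIVE_CTX,
}
print(round(factor, 2))
```

Note that static YaRN scaling applies the same factor at all lengths, so it is normally enabled only when prompts actually exceed the native window.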
Architecture
- Causal Language Model with Vision Encoder
- 27B dense parameters
- 64 layers with a 16 × (3 × (Gated DeltaNet → FFN) → 1 × (Gated Attention → FFN)) hybrid layout
- Gated DeltaNet linear attention (48 V heads, 16 QK heads, head dim 128)
- Gated Attention (24 Q heads, 4 KV heads, head dim 256)
- Feed Forward Network intermediate dimension 17408
- Multi-token prediction (MTP) trained over multiple prediction steps
- Native 262K context, extensible to 1M tokens
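The 64-layer hybrid schedule and head counts above can be sanity-checked with a short sketch (the layer-type names are illustrative, not the model's actual module names):

```python
# Illustrative sketch of the hybrid layout: 16 repeated blocks, each made of
# three Gated DeltaNet layers followed by one Gated Attention layer.
BLOCKS = 16
PATTERN = ["gated_deltanet"] * 3 + ["gated_attention"]
layout = [kind for _ in range(BLOCKS) for kind in PATTERN]

print(len(layout))                     # 64 total layers
print(layout.count("gated_deltanet"))  # 48 linear-attention layers
print(layout.count("gated_attention")) # 16 full-attention layers

# Gated Attention uses grouped-query attention: 24 query heads share 4 KV heads,
# i.e. 6 query heads per KV head.
GQA_GROUP = 24 // 4
```

Three of every four layers are linear-attention DeltaNet layers, which is where the high-throughput, low-latency claim in Key Features comes from: only one layer in four pays the quadratic attention cost.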
Use Cases
- Multimodal reasoning and visual question answering
- Document, chart, and diagram understanding
- Coding and software engineering agents
- Tool-using agent workflows across long horizons
- Multilingual chat and instruction following across 201 languages
- Long-context analysis and retrieval over large document sets
Benchmarks
On the Qwen3.5 benchmark suite (source), Qwen3.5 27B outperforms the larger Qwen3-235B-A22B on several metrics while activating every parameter densely:

| Benchmark | Score |
| --- | --- |
| MMLU-Pro | 86.1 |
| MMLU-Redux | 93.2 |
| C-Eval | 90.5 |
| SuperGPQA | 65.6 |
| IFEval | 95.0 |
| GPQA Diamond | 85.5 |
| LongBench v2 | 60.6 |