Qwen3.5 27B

LLM · Reasoning · Vision Language

Dense 27B vision-language model with unified multimodal reasoning

Details

  • Modalities: text, vision
  • Version: 3.5 27B
  • Recommended Hardware: 1xRTX PRO 6000 S
  • Provider: Alibaba
  • Family: Qwen
  • Parameters: 27B
  • Context: 262,144 tokens
  • License: apache-2.0

Qwen3.5 27B: Dense Vision-Language Reasoning Model

Qwen3.5 27B is a dense multimodal foundation model from Alibaba's Qwen team, built on a hybrid Gated DeltaNet and Gated Attention architecture. With 27 billion parameters, it pairs strong text reasoning with native vision understanding through early fusion multimodal training, delivering competitive benchmark performance against much larger models while remaining practical to serve on single-node hardware.

Key Features

  • Unified Vision-Language Foundation - Early fusion training on multimodal tokens achieves cross-generational parity with Qwen3 and outperforms Qwen3-VL models across reasoning, coding, agents, and visual understanding benchmarks
  • Efficient Hybrid Architecture - Gated Delta Networks combined with Gated Attention deliver high-throughput inference with minimal latency overhead
  • Scalable RL Generalization - Reinforcement learning scaled across million-agent environments with progressively complex task distributions for robust real-world adaptability
  • Global Linguistic Coverage - Expanded support to 201 languages and dialects for inclusive worldwide deployment
  • Long Context - 262,144 tokens natively, extensible up to 1,010,000 tokens with YaRN
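
The YaRN extension in the last bullet is a rope-scaling override rather than a separate checkpoint. A minimal sketch of enabling it with Hugging Face Transformers, assuming the checkpoint follows the usual Qwen `rope_scaling` convention (the Hub id and the scaling factor below are placeholders inferred from the numbers on this card):

```python
from transformers import AutoConfig

# Placeholder Hub id; check the actual repository name before use.
MODEL_ID = "Qwen/Qwen3.5-27B"

config = AutoConfig.from_pretrained(MODEL_ID)
# Assumed YaRN settings: a factor of 4.0 stretches the native
# 262,144-token window to ~1,048,576 positions, covering the
# 1,010,000-token figure cited above.
config.rope_scaling = {
    "rope_type": "yarn",
    "factor": 4.0,
    "original_max_position_embeddings": 262144,
}
# Load the model with this config when serving long-context requests.
```

Because static YaRN scaling applies at every sequence length, Qwen model cards have generally recommended enabling it only when prompts actually exceed the native window.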

Architecture

  • Causal Language Model with Vision Encoder
  • 27B dense parameters
  • 64 layers with a 16 × (3 × (Gated DeltaNet → FFN) → 1 × (Gated Attention → FFN)) hybrid layout
  • Gated DeltaNet linear attention (48 V heads, 16 QK heads, head dim 128)
  • Gated Attention (24 Q heads, 4 KV heads, head dim 256)
  • Feed Forward Network intermediate dimension 17408
  • Multi-token prediction (MTP) trained with multi-step supervision
  • Native 262K context, extensible to 1M tokens
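
As a sanity check on the numbers above, the 16-block schedule expands to exactly 64 layers. An illustrative sketch of the layout, not the official implementation:

```python
# Each block stacks 3 Gated DeltaNet layers, then 1 Gated Attention layer.
def hybrid_layout(num_blocks: int = 16) -> list[str]:
    layers: list[str] = []
    for _ in range(num_blocks):
        layers += ["gated_deltanet"] * 3 + ["gated_attention"]
    return layers

layout = hybrid_layout()
assert len(layout) == 64                      # 16 blocks x 4 layers
assert layout.count("gated_attention") == 16  # one full-attention layer per block
```

Keeping three quarters of the layers in linear-complexity Gated DeltaNet is what drives the throughput claim in Key Features, while the periodic Gated Attention layers retain full pairwise attention for precise recall.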

Use Cases

  • Multimodal reasoning and visual question answering
  • Document, chart, and diagram understanding
  • Coding and software engineering agents
  • Tool-using agent workflows across long horizons
  • Multilingual chat and instruction following across 201 languages
  • Long-context analysis and retrieval over large document sets
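
For the vision-centric use cases, requests mix image and text content parts in the OpenAI-compatible chat format. A hedged sketch, assuming the deployed instance exposes an OpenAI-compatible endpoint (the base URL, API key, image URL, and served model name below are all placeholders):

```python
from openai import OpenAI

# Placeholder endpoint and credentials for a deployed dedicated instance.
client = OpenAI(base_url="https://your-instance.example.com/v1", api_key="YOUR_KEY")

response = client.chat.completions.create(
    model="qwen3.5-27b",  # placeholder served-model name
    messages=[{
        "role": "user",
        "content": [
            {"type": "image_url",
             "image_url": {"url": "https://example.com/quarterly-chart.png"}},
            {"type": "text",
             "text": "What trend does this chart show? Answer in two sentences."},
        ],
    }],
)
print(response.choices[0].message.content)
```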

Benchmarks

On the Qwen3.5 benchmark suite (source), Qwen3.5 27B scores MMLU-Pro 86.1, MMLU-Redux 93.2, C-Eval 90.5, SuperGPQA 65.6, IFEval 95.0, GPQA Diamond 85.5, and LongBench v2 60.6. On several of these metrics it outperforms the far larger Qwen3-235B-A22B, despite being a dense model that activates all of its 27B parameters for every token.

Quick Start Guide

  1. Choose a model and click 'Deploy' above to find available GPUs recommended for this model.
  2. Rent your dedicated instance preconfigured with the model you've selected.
  3. Start sending requests to your model instance and get responses right away.
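
Once the instance is running, step 3 is a standard chat-completion call. A minimal text-only sketch, using the same placeholder endpoint, key, and served model name as the vision example above:

```python
from openai import OpenAI

# Substitute your instance's endpoint, key, and served model name.
client = OpenAI(base_url="https://your-instance.example.com/v1", api_key="YOUR_KEY")

reply = client.chat.completions.create(
    model="qwen3.5-27b",
    messages=[{"role": "user",
               "content": "Summarize YaRN context extension in one line."}],
    max_tokens=64,
)
print(reply.choices[0].message.content)
```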