Qwen 3.6 27B

27 billion parameters, dense architecture - outperforms its 397B MoE predecessor on SWE-bench Verified

Qwen 3.6 27B is a dense model built on the Hybrid Gated DeltaNet architecture with 64 layers and 262K native context. It scores 77.2% on SWE-bench Verified, surpassing the 397B MoE Qwen 3 at 76.2%, while fitting in ~55.6GB VRAM at FP16 or ~18GB with quantization.

Model variants

Dense architecture, maximum quality per parameter

Qwen 3.6 27B delivers frontier-class performance in a dense 27B form factor. Choose the instruction-tuned variant for chat and agentic tasks, or the base model for fine-tuning.

Hybrid Gated DeltaNet Architecture

27B dense parameters, 64 layers, hidden dimension 5120

Qwen 3.6 27B uses a Hybrid Gated DeltaNet design that combines linear attention efficiency with gated recurrence. The 262K native context window is extensible to 1M tokens, making it ideal for long-document analysis and complex agentic workflows.

Qwen 3.6 27B needs ~55.6GB of VRAM at FP16 or ~18GB when quantized, so it runs on a single high-end GPU, or on dual consumer GPUs using the GGUF format.
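The memory figures can be sanity-checked with simple arithmetic. The sketch below is illustrative only: it counts weight storage alone (no KV cache, activations, or runtime overhead), assumes decimal gigabytes, and uses hypothetical helper names. The page does not say how its ~55.6GB and ~18GB figures were measured.

```python
GB = 1e9  # assumption: decimal gigabytes; the page does not say GB vs GiB


def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Memory for the weights alone, in GB (no KV cache or activations)."""
    return num_params * bits_per_weight / 8 / GB


def implied_params(memory_gb: float, bits_per_weight: float) -> float:
    """Parameter count that would fill `memory_gb` at the given precision."""
    return memory_gb * GB * 8 / bits_per_weight


def implied_bits_per_weight(memory_gb: float, num_params: float) -> float:
    """Average bits per weight implied by a memory footprint."""
    return memory_gb * GB * 8 / num_params


if __name__ == "__main__":
    # A nominal 27B parameters at 16 bits each.
    print(weight_memory_gb(27e9, 16))  # 54.0

    # The quoted ~55.6GB at FP16 implies slightly more than 27B parameters.
    params = implied_params(55.6, 16)
    print(f"{params / 1e9:.1f}B parameters implied by 55.6GB at FP16")

    # The quoted ~18GB then implies an average precision between 5 and 6 bits.
    bits = implied_bits_per_weight(18.0, params)
    print(f"{bits:.2f} bits per weight implied by 18GB")
```

Under these assumptions the quantized figure corresponds to roughly 5 bits per weight, which is in the range of common GGUF quantization levels. Actual usage grows with context length because of the cache and activations.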

Instruction-tuned

27B Instruct

Optimized for conversational AI, coding, and complex agentic tasks

Fine-tuned for instruction following, multi-turn dialogue, and tool use via the Qwen 3.6 27B API

Available now

Pre-trained

27B Base

Foundation dense model for fine-tuning and specialized applications

Pre-trained on diverse data with Hybrid Gated DeltaNet architecture for maximum quality

Available now

Capabilities

Dense powerhouse that punches above its weight class

Qwen 3.6 27B combines the Hybrid Gated DeltaNet architecture with 262K context to surpass a model roughly 14x its size, the 397B MoE Qwen 3, on the real-world coding benchmark SWE-bench Verified.

Elite software engineering

77.2% on SWE-bench Verified - beating the 397B MoE Qwen 3 (76.2%). The result shows that a dense architecture can match far larger models on real-world coding.

Terminal mastery

59.3 on Terminal-Bench 2.0, matching Claude 4.5 Opus. Handles complex multi-step terminal workflows, debugging sessions, and system administration tasks with expert-level proficiency.

Advanced reasoning

94.1% on AIME 2026 mathematics and 86.2 on MMLU-Pro knowledge reasoning. Step-by-step thinking mode enables transparent problem solving across math, logic, and science.

262K to 1M context

262K native context window extensible to 1M tokens. Process entire codebases, long research papers, and multi-turn conversations without losing coherence.
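As a rough illustration of how the two context tiers might be handled in application code, the sketch below classifies a request by its token budget. The exact token limits are assumptions ("262K" read as 2**18 and "1M" as 10**6), and `context_tier` is a hypothetical helper, not part of any Qwen tooling. How the extension to 1M tokens is enabled is not described here.

```python
NATIVE_CONTEXT = 262_144      # assumption: "262K" read as 2**18 tokens
EXTENDED_CONTEXT = 1_000_000  # assumption: "1M" read as 10**6 tokens


def context_tier(prompt_tokens: int, max_new_tokens: int = 0) -> str:
    """Classify a request by the context window it would need.

    Returns "native" if prompt plus generation fits in the native window,
    "extended" if it only fits in the extended window, else "too_long".
    """
    if prompt_tokens < 0 or max_new_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = prompt_tokens + max_new_tokens
    if total <= NATIVE_CONTEXT:
        return "native"
    if total <= EXTENDED_CONTEXT:
        return "extended"
    return "too_long"


if __name__ == "__main__":
    print(context_tier(200_000, max_new_tokens=8_000))  # native
    print(context_tier(600_000, max_new_tokens=8_000))  # extended
    print(context_tier(1_200_000))                      # too_long
```

Token counts should come from the model's own tokenizer, since character-based estimates vary widely between prose and code.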

Competitive coding

83.9 on LiveCodeBench v6 for competitive programming. Excels at algorithmic problem solving, code generation, and complex debugging tasks.

Practical skill execution

48.2 on SkillsBench, surpassing Claude 4.5 Opus (45.3). Demonstrates superior ability to follow complex instructions and execute multi-step real-world tasks.

Key highlights

Exceptional Qwen 3.6 27B benchmark results

Qwen 3.6 27B achieves frontier-class results across coding, reasoning, and agentic benchmarks while maintaining efficient dense inference.

Top achievements

SWE-bench Verified: 77.2% - beats 397B MoE predecessor (76.2%)
Terminal-Bench 2.0: 59.3 - matches Claude 4.5 Opus
SkillsBench: 48.2 - beats Claude 4.5 Opus (45.3)
AIME 2026: 94.1% mathematics
LiveCodeBench v6: 83.9 competitive coding

Technical specs

27B dense parameters, 64 layers, hidden dimension 5120
Hybrid Gated DeltaNet architecture
262K native context, extensible to 1M tokens
VRAM: ~55.6GB at FP16, ~18GB quantized
Available in GGUF format for local deployment

Performance

Dense 27B that outperforms 397B MoE on real-world coding

Qwen 3.6 27B scores 77.2% on SWE-bench Verified and 94.1% on AIME 2026, proving that a well-architected dense model can match or exceed models many times its size.

Across its benchmark suite, Qwen 3.6 27B is consistently strong in software engineering, terminal operations, mathematics, and competitive coding, rivaling or surpassing models with more than 10x as many parameters.

Qwen 3.6 27B performance comparison chart across coding and reasoning benchmarks

SWE-bench Verified: 77.2% - surpasses 397B MoE Qwen 3 (76.2%)

Terminal-Bench 2.0: 59.3 - matches Claude 4.5 Opus

SkillsBench: 48.2 - beats Claude 4.5 Opus at 45.3

AIME 2026: 94.1% on advanced mathematics

MMLU-Pro: 86.2 across diverse knowledge domains

Benchmark comparison

Qwen 3.6 27B vs frontier models

Qwen 3.6 27B delivers frontier-class performance across software engineering, terminal operations, reasoning, and coding benchmarks. The model is accessible via the Qwen 3.6 27B API.

Benchmark	Qwen 3.6 27B (dense)	Qwen 3 235B A22B (MoE)	Claude 4.5 Opus (proprietary)	Qwen 3.6 35B A3B (MoE)
SWE-bench Verified (real-world software engineering)	77.2%	76.2%	-	73.4%
Terminal-Bench 2.0 (terminal operations)	59.3	-	59.3	51.5
SkillsBench (real-world task execution)	48.2	-	45.3	-
AIME 2026 (mathematics, no tools)	94.1%	-	-	92.7%
LiveCodeBench v6 (competitive coding)	83.9	-	-	80.4
MMLU-Pro (knowledge and reasoning)	86.2	-	-	-

Benchmark results are from the official Qwen 3.6 model card and Hugging Face evaluations.
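To make the margins in the comparison table easier to read, the following sketch encodes the published scores and computes the gap between Qwen 3.6 27B and each other model. The scores and column names are copied from the table, with `None` standing in for cells shown as "-". The `margin` helper is hypothetical and purely illustrative.

```python
from typing import Dict, Optional

BASELINE = "Qwen 3.6 27B"

# Scores copied from the comparison table; None marks a "-" (not reported) cell.
SCORES: Dict[str, Dict[str, Optional[float]]] = {
    "SWE-bench Verified": {
        "Qwen 3.6 27B": 77.2, "Qwen 3 235B A22B": 76.2,
        "Claude 4.5 Opus": None, "Qwen 3.6 35B A3B": 73.4,
    },
    "Terminal-Bench 2.0": {
        "Qwen 3.6 27B": 59.3, "Qwen 3 235B A22B": None,
        "Claude 4.5 Opus": 59.3, "Qwen 3.6 35B A3B": 51.5,
    },
    "SkillsBench": {
        "Qwen 3.6 27B": 48.2, "Qwen 3 235B A22B": None,
        "Claude 4.5 Opus": 45.3, "Qwen 3.6 35B A3B": None,
    },
    "AIME 2026": {
        "Qwen 3.6 27B": 94.1, "Qwen 3 235B A22B": None,
        "Claude 4.5 Opus": None, "Qwen 3.6 35B A3B": 92.7,
    },
    "LiveCodeBench v6": {
        "Qwen 3.6 27B": 83.9, "Qwen 3 235B A22B": None,
        "Claude 4.5 Opus": None, "Qwen 3.6 35B A3B": 80.4,
    },
    "MMLU-Pro": {
        "Qwen 3.6 27B": 86.2, "Qwen 3 235B A22B": None,
        "Claude 4.5 Opus": None, "Qwen 3.6 35B A3B": None,
    },
}


def margin(benchmark: str, other: str) -> Optional[float]:
    """Points by which the baseline leads `other`; None if no score is reported."""
    row = SCORES[benchmark]
    theirs = row[other]
    if theirs is None:
        return None
    return round(row[BASELINE] - theirs, 1)


if __name__ == "__main__":
    for bench, row in SCORES.items():
        for model in row:
            gap = margin(bench, model)
            if model != BASELINE and gap is not None:
                print(f"{bench}: {gap:+.1f} vs {model}")
```

Only cells with a reported score produce a margin, so the sparse columns do not turn into misleading zeros.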

Hybrid Gated DeltaNet

A new architecture that redefines dense model efficiency

The Hybrid Gated DeltaNet architecture combines linear attention with gated recurrence across 64 layers and a hidden dimension of 5120. This design enables 262K native context extensible to 1M tokens while maintaining the inference simplicity of a dense model.

64 layers with hidden dimension 5120 for deep representation learning
262K native context window, extensible to 1M tokens
VRAM: ~55.6GB at FP16, ~18GB with quantization (GGUF)
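To give a feel for what "linear attention with gated recurrence" can mean, here is a toy single-head sketch of a generic gated delta-rule update in plain Python. It is not the Qwen implementation: the page does not specify the layer internals, so the update rule, the scalar gates `alpha` and `beta`, and all function names are assumptions chosen for illustration. In a real model the gates and the key, value, and query vectors would be computed from the input.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def zeros(rows: int, cols: int) -> Matrix:
    return [[0.0] * cols for _ in range(rows)]


def matvec(S: Matrix, x: Vector) -> Vector:
    """Multiply the state matrix by a key or query vector."""
    return [sum(s * xi for s, xi in zip(row, x)) for row in S]


def gated_delta_step(S: Matrix, k: Vector, v: Vector,
                     alpha: float, beta: float) -> Matrix:
    """One recurrent update: S <- alpha * (S - beta * (S k) k^T) + beta * v k^T.

    alpha decays the old memory (the gate); beta controls how strongly the
    value previously stored under key k is replaced by v (the delta rule).
    """
    Sk = matvec(S, k)  # what the memory currently returns for key k
    return [
        [alpha * (S[i][j] - beta * Sk[i] * k[j]) + beta * v[i] * k[j]
         for j in range(len(k))]
        for i in range(len(v))
    ]


if __name__ == "__main__":
    S = zeros(2, 2)  # state size is fixed, regardless of sequence length

    S = gated_delta_step(S, k=[1.0, 0.0], v=[2.0, 3.0], alpha=1.0, beta=1.0)
    print(matvec(S, [1.0, 0.0]))  # [2.0, 3.0]

    # Writing to the same key replaces the old value instead of adding to it.
    S = gated_delta_step(S, k=[1.0, 0.0], v=[5.0, 7.0], alpha=1.0, beta=1.0)
    print(matvec(S, [1.0, 0.0]))  # [5.0, 7.0]

    # A gate below 1 decays older memories while a new key is written.
    S = gated_delta_step(S, k=[0.0, 1.0], v=[1.0, 1.0], alpha=0.5, beta=1.0)
    print(matvec(S, [1.0, 0.0]))  # [2.5, 3.5]
    print(matvec(S, [0.0, 1.0]))  # [1.0, 1.0]
```

The point of the sketch is that the memory is a fixed-size matrix updated once per token, so the cost per token does not grow with the length of the context. This property is what makes recurrent linear-attention layers attractive for very long inputs.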

Software Engineering

77.2% SWE-bench Verified - the dense model that beat a 397B MoE

Qwen 3.6 27B achieves 77.2% on SWE-bench Verified, surpassing its 397B MoE predecessor at 76.2%. Combined with 59.3 on Terminal-Bench 2.0 (matching Claude 4.5 Opus) and 83.9 on LiveCodeBench v6, it is a complete software engineering assistant, accessible through the Qwen 3.6 27B API.