Qwen 3.6 27B vs 35B A3B

Dense power vs MoE efficiency - choose the open-weight model that fits your hardware and workflow

Qwen 3.6 27B is a dense model with all 27B parameters active on every token, and it delivers the stronger coding performance of the two: 77.2% on SWE-bench Verified, 48.2 on SkillsBench (ahead of Claude 4.5 Opus at 45.3), and 83.9 on LiveCodeBench. It can run in 16GB of VRAM using the IQ4_XS GGUF quantization with KV cache compression. The 35B A3B uses a Mixture-of-Experts (MoE) design with only 3B active parameters per token. It reaches 73.4% on SWE-bench Verified, runs on a Mac M4 16GB at Q3 quantization, and generates 20-40 tok/s on consumer hardware. Same family, same Apache 2.0 license, different tradeoffs.

Benchmarks

Qwen 3.6 27B vs 35B A3B - detailed benchmark and hardware comparison

Comprehensive benchmark results showing the performance gap between the dense and MoE variants across software engineering, coding, terminal operations, mathematical reasoning, frontend generation, and practical coding skills.

The 27B dense model outperforms the 35B A3B MoE variant on every benchmark where both have reported scores, but the gap is moderate. The 35B A3B reaches about 95% of the 27B's score on SWE-bench Verified (73.4% vs 77.2%), with the widest gap on Terminal-Bench 2.0 (51.5 vs 59.3), while activating only 3B parameters per token and generating roughly 2-3x faster (20-40 tok/s vs ~10-15 tok/s). The 27B's SkillsBench score of 48.2 (ahead of Claude 4.5 Opus at 45.3) highlights its strength in practical engineering judgment. For Mac M4 16GB users, the 35B A3B at Q3 is the clear choice. For workstation users prioritizing quality, the 27B at IQ4_XS fits in 16GB of VRAM.
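
To make the size of the gap concrete, the short Python sketch below expresses each 35B A3B score as a percentage of the 27B score, using only the benchmarks where this page reports both numbers. The function name is illustrative, and QwenWebBench is treated as a plain score here, which is an assumption about how that benchmark should be compared.

```python
# Scores copied from the benchmark table on this page: (27B dense, 35B A3B MoE).
# SkillsBench and NL2Repo are omitted because no 35B A3B score is listed.
SCORES = {
    "SWE-bench Verified": (77.2, 73.4),
    "Terminal-Bench 2.0": (59.3, 51.5),
    "AIME 2025": (94.1, 92.7),
    "LiveCodeBench": (83.9, 80.4),
    "QwenWebBench": (1487, 1397),
    "Claw-Eval Avg": (72.4, 68.7),
}


def relative_scores(scores):
    """Return each benchmark's 35B A3B score as a percentage of the 27B score."""
    return {name: 100.0 * moe / dense for name, (dense, moe) in scores.items()}


ratios = relative_scores(SCORES)
for name, pct in ratios.items():
    print(f"{name:20s} {pct:5.1f}%")
print(f"{'Mean':20s} {sum(ratios.values()) / len(ratios):5.1f}%")
```

On these numbers the ratio ranges from about 87% (Terminal-Bench 2.0) to about 98.5% (AIME 2025), with a mean of roughly 94%.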

Figure: benchmark comparison of Qwen 3.6 27B vs 35B A3B across SWE-bench Verified, Terminal-Bench 2.0, AIME 2025, LiveCodeBench, SkillsBench, QwenWebBench, and Claw-Eval

SWE-bench Verified: 77.2% (27B) vs 73.4% (35B A3B)

SkillsBench: 48.2 (27B) beats Claude 4.5 Opus (45.3)

Terminal-Bench 2.0: 59.3 (27B) vs 51.5 (35B A3B)

35B A3B: 20-40 tok/s on consumer hardware, Mac M4 16GB confirmed

27B IQ4_XS: fits 16GB VRAM with KV cache compression (100K context)

Benchmark table

27B Dense vs 35B A3B MoE - full results with hardware specs

Side-by-side benchmark comparison with hardware requirements, inference speed, and efficiency metrics for both open-weight models.

Benchmark	What it measures	Qwen 3.6 27B Dense (all params active, best quality)	Qwen 3.6 35B A3B MoE (3B active, best efficiency)
SWE-bench Verified	Real-world software engineering	77.2%	73.4%
Terminal-Bench 2.0	Terminal operations	59.3	51.5
SkillsBench	Practical coding skills (Claude 4.5 Opus: 45.3)	48.2	-
AIME 2025	Competition mathematics	94.1%	92.7%
LiveCodeBench	Competitive code generation	83.9	80.4
QwenWebBench	Frontend code generation	1487	1397
Claw-Eval Avg	End-to-end agentic coding	72.4	68.7
NL2Repo	Natural language to repository	36.2	-
Model size (FP16)	Full precision weight size	55.6 GB	~70 GB total
Minimum VRAM (quantized)	Smallest working configuration	16 GB (IQ4_XS + KV cache)	~17 GB (Q3_K_M)
Recommended VRAM	Comfortable operation with context	24 GB (Q4_K_M)	24 GB (Q4_K_M)
Active parameters	Parameters computed per token	27B (all)	3B (of 35B)
Inference speed (4-bit)	Consumer hardware tok/s	~10-15 tok/s	20-40 tok/s
Mac M4 16GB	Apple Silicon laptop	IQ4_XS (tight)	Q3_K_M (confirmed)

Benchmark data from HuggingFace model cards and official Qwen 3.6 release. Hardware benchmarks from Unsloth community testing. SkillsBench reference: Claude 4.5 Opus scores 45.3.
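
The memory rows in the table can be sanity-checked with simple arithmetic: weight size is roughly parameter count times bits per weight. The sketch below is an estimate only. The bits-per-weight values for the GGUF formats are approximate averages assumed for illustration, the helper name is hypothetical, and the result covers weights alone, not the KV cache or runtime overhead.

```python
# Approximate average bits per weight for each format. These are assumptions
# for illustration; real GGUF files vary by model and quantization recipe.
BITS_PER_WEIGHT = {
    "FP16": 16.0,
    "Q4_K_M": 4.85,
    "IQ4_XS": 4.25,
    "Q3_K_M": 3.9,
}


def weight_size_gb(params_billion, fmt):
    """Estimate weight storage in GB (10^9 bytes) for a parameter count and format."""
    bits = BITS_PER_WEIGHT[fmt]
    # billions of parameters * bytes per parameter = gigabytes
    return params_billion * bits / 8.0


for label, params in (("27B dense", 27), ("35B A3B MoE", 35)):
    for fmt in BITS_PER_WEIGHT:
        print(f"{label:12s} {fmt:7s} ~{weight_size_gb(params, fmt):5.1f} GB")
```

Under these assumptions the 27B comes out near 14 GB at IQ4_XS and the 35B A3B near 17 GB at Q3_K_M, which is in line with the minimum VRAM figures in the table once some room is left for context.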

Qwen 3.6 27B

27B dense model: 16GB VRAM, 77.2% SWE-bench, peak performance

Qwen 3.6 27B is a dense transformer where all 27 billion parameters are active on every token. It runs on a single 16GB GPU with IQ4_XS quantization, or on a Mac M4 Pro with 32GB RAM, and achieves 77.2% on SWE-bench Verified and 48.2 on SkillsBench, ahead of Claude 4.5 Opus (45.3). It is the best choice when you have 16GB+ VRAM and want the higher benchmark scores of the two models.

77.2% SWE-bench Verified, 48.2 SkillsBench (beats Claude 4.5 Opus)
59.3 Terminal-Bench 2.0, 72.4 Claw-Eval Avg
Runs on 16GB GPU with IQ4_XS, or Mac M4 Pro with 32GB RAM

Qwen 3.6 35B A3B

35B A3B MoE: runs on Mac M4 16GB with strong coding performance

Qwen 3.6 35B A3B uses a Mixture-of-Experts architecture with 35 billion total parameters but only 3 billion active per token, which keeps per-token compute low and generation fast. All 35 billion parameters still have to be loaded, so memory needs depend on the quantization level. It runs on a Mac M4 16GB at Q3 quantization and achieves 73.4% on SWE-bench Verified and 68.7 on Claw-Eval Avg. It is the best choice for Mac users, or when GPU memory is limited but you want strong coding ability.

73.4% SWE-bench Verified, 68.7 Claw-Eval Avg
51.5 Terminal-Bench 2.0, 80.4 LiveCodeBench
MoE: 35B total, 3B active - runs on Mac M4 16GB at Q3
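
As a rough illustration of what "3B active" and the quoted throughput ranges mean in practice, the sketch below computes the fraction of parameters used per token and the time to generate a response of a given length. The throughput ranges are the ones quoted on this page for 4-bit inference. The function names are hypothetical, and prompt processing time is ignored.

```python
# Throughput ranges (tokens per second) quoted on this page for 4-bit inference.
TOK_PER_SEC = {
    "27B dense": (10.0, 15.0),
    "35B A3B MoE": (20.0, 40.0),
}


def active_fraction(active_billion, total_billion):
    """Fraction of the model's parameters used to compute each token."""
    return active_billion / total_billion


def generation_seconds(num_tokens, tok_per_sec_range):
    """Return (best_case, worst_case) seconds to generate num_tokens."""
    slow, fast = tok_per_sec_range
    return num_tokens / fast, num_tokens / slow


print(f"35B A3B active fraction: {active_fraction(3, 35):.1%}")
for name, speed in TOK_PER_SEC.items():
    best, worst = generation_seconds(1000, speed)
    print(f"{name:12s} 1000 tokens in {best:.0f}-{worst:.0f} s")
```

For a 1000-token answer this gives roughly 25-50 seconds on the 35B A3B versus 67-100 seconds on the 27B, using the page's own throughput ranges.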

Download models

Get both Qwen 3.6 open-weight models

Download from HuggingFace or pull via Ollama for immediate local deployment.

Qwen 3.6 27B on HuggingFace

Download dense model weights - Apache 2.0 licensed

Qwen 3.6 27B on Ollama

Pull and run instantly with Ollama

Qwen 3.6 35B A3B on HuggingFace

Download MoE model weights - Apache 2.0 licensed

Qwen 3.6 35B A3B on Ollama

Pull and run on Mac M4 or GPU with Ollama
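
Once a model has been pulled, a locally running Ollama server can be called over its REST API. The sketch below sends a single non-streaming request to the `/api/generate` endpoint on Ollama's default port. The model tag is hypothetical and used only for illustration, since this page does not give the exact tag, so check the Ollama library for the real name before running it.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint
MODEL_TAG = "qwen3.6:35b-a3b"  # HYPOTHETICAL tag; look up the real one in the Ollama library


def build_request(prompt, model=MODEL_TAG):
    """Build a non-streaming generate request for a locally running Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, model=MODEL_TAG, timeout=300):
    """Send the prompt and return the generated text from the 'response' field."""
    with urllib.request.urlopen(build_request(prompt, model), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))["response"]


# Example call (requires a running Ollama server with the model already pulled):
# print(generate("Write a Python function that reverses a string."))
```

Building the request separately from sending it keeps the payload easy to inspect, and the same code works for either model by passing a different `model` argument.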

Deploy and use

Run models and build applications

Deploy locally, try in the hosted chat, or read the full documentation.

Try hosted chat

Chat with Qwen 3.6 instantly without local setup

Deployment guide

Full guide for local deployment with quantization tips

Local setup overview

Compare deployment options: Ollama, vLLM, llama.cpp

Qwen ecosystem

Two open-weight models for every deployment scenario - Apache 2.0 licensed

Whether you prioritize maximum quality (27B, 77.2% SWE-bench Verified) or hardware efficiency (35B A3B, 20-40 tok/s on consumer hardware), the Qwen 3.6 open-weight family has a model that fits. Both are released under Apache 2.0 with multimodal (vision) and tool calling support.

Qwen 3.6 27B

Dense, 77.2% SWE-bench, 48.2 SkillsBench

Qwen 3.6 35B A3B

MoE, 73.4% SWE-bench, Mac M4 16GB

Ollama setup

One-command local deployment for both models

GGUF models

Quantized models for every VRAM budget

vLLM serving

Production deployment with continuous batching

Community

Get help choosing the right model

Get started

Ready to choose your Qwen 3.6 open-weight model? Try both for free

Try both models in the browser, then download the one that fits your hardware: the 27B for maximum quality (77.2% SWE-bench Verified, ahead of Claude 4.5 Opus on SkillsBench), or the 35B A3B for consumer hardware (20-40 tok/s, Mac M4 16GB confirmed). Both are Apache 2.0 licensed.
