Qwen 3.6 27B vs 35B A3B

Dense power vs MoE efficiency - choose the open-weight model that fits your hardware and workflow

Qwen 3.6 27B is a dense model with all 27B parameters active, delivering the best open-weight coding performance: 77.2% SWE-bench, 48.2 SkillsBench (beats Claude 4.5 Opus at 45.3), 83.9 LiveCodeBench. It can run on 16GB VRAM using IQ4_XS GGUF with KV cache compression. The 35B A3B uses Mixture-of-Experts with only 3B active parameters, achieving 73.4% SWE-bench while running on Mac M4 16GB at Q3 quantization with 20-40 tok/s on consumer hardware. Same family, same Apache 2.0 license, different tradeoffs.

Benchmarks

Qwen 3.6 27B vs 35B A3B - detailed benchmark and hardware comparison

Comprehensive benchmark results showing the performance gap between the dense and MoE variants across software engineering, coding, terminal operations, mathematical reasoning, frontend generation, and practical coding skills.

The 27B dense model consistently outperforms the 35B A3B MoE variant across all benchmarks, but the gap is moderate and predictable. The 35B A3B achieves approximately 95% of the 27B's quality while requiring only ~40% of the compute and running 2-3x faster. The 27B's SkillsBench score of 48.2 (beating Claude 4.5 Opus at 45.3) highlights its strength in practical engineering judgment. For Mac M4 16GB users, the 35B A3B at Q3 is the clear choice. For workstation users prioritizing quality, the 27B with IQ4_XS fits 16GB VRAM.

Benchmark comparison chart showing Qwen 3.6 27B vs 35B A3B performance across SWE-bench, Terminal-Bench, AIME, LiveCodeBench, SkillsBench, QwenWebBench, and Claw-Eval

SWE-bench Verified: 77.2% (27B) vs 73.4% (35B A3B)

SkillsBench: 48.2 (27B) beats Claude 4.5 Opus (45.3)

Terminal-Bench 2.0: 59.3 (27B) vs 51.5 (35B A3B)

35B A3B: 20-40 tok/s on consumer hardware, Mac M4 16GB confirmed

27B IQ4_XS: fits 16GB VRAM with KV cache compression (100K context)

Benchmark table

27B Dense vs 35B A3B MoE - full results with hardware specs

Side-by-side benchmark comparison with hardware requirements, inference speed, and efficiency metrics for both open-weight models.

Benchmark
Qwen 3.6 27B
Dense (all params active)
Best quality
Qwen 3.6 35B A3B
MoE (3B active)
Best efficiency
SWE-bench Verified
Real-world software engineering
77.2%73.4%
Terminal-Bench 2.0
Terminal operations
59.351.5
SkillsBench
Practical coding skills (Claude 4.5 Opus: 45.3)
48.2-
AIME 2025
Competition mathematics
94.1%92.7%
LiveCodeBench
Competitive code generation
83.980.4
QwenWebBench
Frontend code generation
14871397
Claw-Eval Avg
End-to-end agentic coding
72.468.7
NL2Repo
Natural language to repository
36.2-
Model size (FP16)
Full precision weight size
55.6 GB~70 GB total
Minimum VRAM (quantized)
Smallest working configuration
16 GB (IQ4_XS + KV cache)~17 GB (Q3_K_M)
Recommended VRAM
Comfortable operation with context
24 GB (Q4_K_M)24 GB (Q4_K_M)
Active parameters
Parameters computed per token
27B (all)3B (of 35B)
Inference speed (4-bit)
Consumer hardware tok/s
~10-15 tok/s20-40 tok/s
Mac M4 16GB
Apple Silicon laptop
IQ4_XS (tight)Q3_K_M (confirmed)

Benchmark data from HuggingFace model cards and official Qwen 3.6 release. Hardware benchmarks from Unsloth community testing. SkillsBench reference: Claude 4.5 Opus scores 45.3.

Qwen 3.6 27B

27B dense model: 16GB VRAM, 77.2% SWE-bench, peak performance

Qwen 3.6 27B is a dense transformer where all 27 billion parameters are active on every token. It runs on a single 16GB GPU (IQ4_XS quantization) or Mac M4 Pro with enough RAM, achieving 77.2% SWE-bench Verified and 48.2 SkillsBench - beating Claude 4.5 Opus. Best choice when you have 16GB+ VRAM and want the absolute highest benchmark scores.

  • 77.2% SWE-bench Verified, 48.2 SkillsBench (beats Claude 4.5 Opus)
  • 59.3 Terminal-Bench 2.0, 72.4 Claw-Eval Avg
  • Runs on 16GB GPU with IQ4_XS, or Mac M4 Pro with 32GB RAM
27B dense model: 16GB VRAM, 77.2% SWE-bench, peak performance

Qwen 3.6 35B A3B

35B A3B MoE: runs on Mac M4 16GB with strong coding performance

Qwen 3.6 35B A3B uses a Mixture-of-Experts architecture with 35 billion total parameters but only 3 billion active per token - making it fast and memory-efficient. It runs on Mac M4 16GB at Q3 quantization, achieving 73.4% SWE-bench Verified and 68.7 Claw-Eval. Best choice for Mac users or when GPU memory is limited but you want strong coding ability.

  • 73.4% SWE-bench Verified, 68.7 Claw-Eval Avg
  • 51.5 Terminal-Bench 2.0, 80.4 LiveCodeBench
  • MoE: 35B total, 3B active - runs on Mac M4 16GB at Q3
35B A3B MoE: runs on Mac M4 16GB with strong coding performance

Qwen ecosystem

Two open-weight models for every deployment scenario - Apache 2.0 licensed

Whether you prioritize maximum quality (27B, 77.2% SWE-bench) or hardware efficiency (35B A3B, 20-40 tok/s on consumer GPU), the Qwen 3.6 open-weight family has the right model. Both under Apache 2.0 with vision, multimodal, and tool calling support.

Qwen 3.6 27B

Dense, 77.2% SWE-bench, 48.2 SkillsBench

Download

Qwen 3.6 35B A3B

MoE, 73.4% SWE-bench, Mac M4 16GB

Download

Ollama setup

One-command local deployment for both models

Get started

GGUF models

Quantized models for every VRAM budget

Browse

vLLM serving

Production deployment with continuous batching

Read docs

Community

Get help choosing the right model

Join

Get started

Ready to choose your Qwen 3.6 open-weight model? Try both for free

Try both models in the browser, then download the one that fits your hardware. 27B for maximum quality (77.2% SWE-bench, beats Claude on SkillsBench), 35B A3B for consumer GPU deployment (20-40 tok/s, Mac M4 16GB confirmed). Both Apache 2.0 licensed.