Qwen 3.6 27B vs 35B A3B
Dense power vs MoE efficiency - choose the open-weight model that fits your hardware and workflow
Qwen 3.6 27B is a dense model with all 27B parameters active on every token, delivering the best open-weight coding performance: 77.2% on SWE-bench Verified, 48.2 on SkillsBench (ahead of Claude 4.5 Opus at 45.3), and 83.9 on LiveCodeBench. It can run in 16GB of VRAM using the IQ4_XS GGUF quantization with KV cache compression. The 35B A3B uses a Mixture-of-Experts (MoE) architecture with only 3B active parameters per token. It reaches 73.4% on SWE-bench Verified, runs on a Mac M4 with 16GB at Q3 quantization, and generates 20-40 tok/s on consumer hardware. Same family, same Apache 2.0 license, different tradeoffs.
Benchmarks
Qwen 3.6 27B vs 35B A3B - detailed benchmark and hardware comparison
Comprehensive benchmark results showing the performance gap between the dense and MoE variants across software engineering, coding, terminal operations, mathematical reasoning, frontend generation, and practical coding skills.
The 27B dense model outperforms the 35B A3B MoE variant on every benchmark where both are reported, but the gap is moderate and predictable. The 35B A3B achieves approximately 95% of the 27B's quality while requiring only ~40% of the compute and running 2-3x faster. The 27B's SkillsBench score of 48.2 (ahead of Claude 4.5 Opus at 45.3) highlights its strength in practical engineering judgment. For Mac M4 16GB users, the 35B A3B at Q3 is the clear choice. For workstation users who prioritize quality, the 27B at IQ4_XS fits in 16GB of VRAM.
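The "approximately 95%" figure can be sanity-checked against the scores published on this page. The sketch below is a rough illustration, not an official methodology: it divides each 35B A3B score by the matching 27B score and takes an unweighted mean. The function names are hypothetical, and averaging ratios across benchmarks that use different scales is an assumption made only for this estimate.

```python
# Scores copied from this page's comparison table: (27B dense, 35B A3B MoE).
# Only benchmarks reported for both models are included.
SCORES = {
    "SWE-bench Verified": (77.2, 73.4),
    "Terminal-Bench 2.0": (59.3, 51.5),
    "AIME 2025": (94.1, 92.7),
    "LiveCodeBench": (83.9, 80.4),
    "QwenWebBench": (1487, 1397),
    "Claw-Eval Avg": (72.4, 68.7),
}


def relative_scores(scores):
    """Return the MoE score as a fraction of the dense score, per benchmark."""
    return {name: moe / dense for name, (dense, moe) in scores.items()}


def mean_ratio(scores):
    """Unweighted mean of the per-benchmark ratios (an assumed aggregation)."""
    ratios = relative_scores(scores)
    return sum(ratios.values()) / len(ratios)


if __name__ == "__main__":
    for name, ratio in relative_scores(SCORES).items():
        print(f"{name:20s} {ratio:.1%}")
    print(f"{'Mean':20s} {mean_ratio(SCORES):.1%}")
```

Under these assumptions the mean comes out near 94%, with the widest gap on Terminal-Bench 2.0 (about 87%) and the narrowest on AIME 2025 (about 99%), which is roughly in line with the 95% summary.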


- SWE-bench Verified: 77.2% (27B) vs 73.4% (35B A3B)
- SkillsBench: 48.2 (27B) vs 45.3 (Claude 4.5 Opus)
- Terminal-Bench 2.0: 59.3 (27B) vs 51.5 (35B A3B)
- 35B A3B: 20-40 tok/s on consumer hardware, Mac M4 16GB confirmed
- 27B IQ4_XS: fits 16GB VRAM with KV cache compression (100K context)
Benchmark table
27B Dense vs 35B A3B MoE - full results with hardware specs
Side-by-side benchmark comparison with hardware requirements, inference speed, and efficiency metrics for both open-weight models.
| Benchmark or spec | Qwen 3.6 27B Dense (all parameters active; best quality) | Qwen 3.6 35B A3B MoE (3B active; best efficiency) |
|---|---|---|
| SWE-bench Verified (real-world software engineering) | 77.2% | 73.4% |
| Terminal-Bench 2.0 (terminal operations) | 59.3 | 51.5 |
| SkillsBench (practical coding skills; Claude 4.5 Opus: 45.3) | 48.2 | - |
| AIME 2025 (competition mathematics) | 94.1% | 92.7% |
| LiveCodeBench (competitive code generation) | 83.9 | 80.4 |
| QwenWebBench (frontend code generation) | 1487 | 1397 |
| Claw-Eval Avg (end-to-end agentic coding) | 72.4 | 68.7 |
| NL2Repo (natural language to repository) | 36.2 | - |
| Model size, FP16 (full-precision weight size) | 55.6 GB | ~70 GB total |
| Minimum VRAM, quantized (smallest working configuration) | 16 GB (IQ4_XS + KV cache) | ~17 GB (Q3_K_M) |
| Recommended VRAM (comfortable operation with context) | 24 GB (Q4_K_M) | 24 GB (Q4_K_M) |
| Active parameters (computed per token) | 27B (all) | 3B (of 35B) |
| Inference speed, 4-bit (consumer hardware) | ~10-15 tok/s | 20-40 tok/s |
| Mac M4 16GB (Apple Silicon laptop) | IQ4_XS (tight) | Q3_K_M (confirmed) |
Benchmark data from HuggingFace model cards and official Qwen 3.6 release. Hardware benchmarks from Unsloth community testing. SkillsBench reference: Claude 4.5 Opus scores 45.3.
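The weight sizes in the table follow from simple arithmetic: parameter count times bits per weight, divided by 8. The sketch below is a back-of-the-envelope estimate under stated assumptions, not a measurement. The bits-per-weight values for the GGUF quantization types are approximate figures assumed for illustration, the parameter counts are the nominal 27B and 35B, and the estimate covers weights only (no KV cache, activations, or runtime overhead). The function name is hypothetical.

```python
# Approximate bits per weight. The GGUF values are assumptions for
# illustration; real files vary with the per-tensor quantization mix.
BITS_PER_WEIGHT = {
    "FP16": 16.0,
    "Q4_K_M": 4.85,   # assumed approximate value
    "IQ4_XS": 4.25,   # assumed approximate value
    "Q3_K_M": 3.9,    # assumed approximate value
}


def weight_size_gb(params_billions: float, bits_per_weight: float) -> float:
    """Estimate weight storage in decimal GB: params * bits / 8 bytes."""
    return params_billions * bits_per_weight / 8


if __name__ == "__main__":
    for model, params in (("27B dense", 27.0), ("35B A3B", 35.0)):
        for quant, bits in BITS_PER_WEIGHT.items():
            size = weight_size_gb(params, bits)
            print(f"{model:10s} {quant:7s} ~{size:.1f} GB")
```

With the nominal counts, FP16 works out to 54 GB for the 27B and 70 GB for the 35B A3B. The table's 55.6 GB for the 27B suggests the real parameter count is slightly above the nominal 27B. Note that an MoE model must hold all 35B parameters in memory even though only 3B are active per token, which is why its footprint is larger despite the faster inference.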
Qwen 3.6 27B
27B dense model: 16GB VRAM, 77.2% SWE-bench, peak performance
Qwen 3.6 27B is a dense transformer where all 27 billion parameters are active on every token. It runs on a single 16GB GPU (IQ4_XS quantization) or a Mac M4 Pro with 32GB RAM, achieving 77.2% on SWE-bench Verified and 48.2 on SkillsBench, ahead of Claude 4.5 Opus (45.3). Best choice when you have 16GB+ VRAM and want the highest benchmark scores of the two models.
- 77.2% SWE-bench Verified, 48.2 SkillsBench (beats Claude 4.5 Opus)
- 59.3 Terminal-Bench 2.0, 72.4 Claw-Eval Avg
- Runs on 16GB GPU with IQ4_XS, or Mac M4 Pro with 32GB RAM

Qwen 3.6 35B A3B
35B A3B MoE: runs on Mac M4 16GB with strong coding performance
Qwen 3.6 35B A3B uses a Mixture-of-Experts architecture with 35 billion total parameters but only 3 billion active per token, which makes inference fast and compute-efficient. It runs on a Mac M4 with 16GB at Q3 quantization, achieving 73.4% on SWE-bench Verified and 68.7 on Claw-Eval Avg. Best choice for Mac users or when GPU memory is limited but you want strong coding ability.
- 73.4% SWE-bench Verified, 68.7 Claw-Eval Avg
- 51.5 Terminal-Bench 2.0, 80.4 LiveCodeBench
- MoE: 35B total, 3B active - runs on Mac M4 16GB at Q3

Download models
Get both Qwen 3.6 open-weight models
Download from HuggingFace or pull via Ollama for immediate local deployment.
Deploy and use
Run models and build applications
Deploy locally, try in the hosted chat, or read the full documentation.
Qwen ecosystem
Two open-weight models for every deployment scenario - Apache 2.0 licensed
Whether you prioritize maximum quality (27B, 77.2% SWE-bench) or hardware efficiency (35B A3B, 20-40 tok/s on consumer GPU), the Qwen 3.6 open-weight family has the right model. Both under Apache 2.0 with vision, multimodal, and tool calling support.
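The hardware guidance on this page can be condensed into a small decision helper. This is only a sketch of the page's recommendations: the function name, parameters, and return strings are hypothetical, and the 16 GB and 24 GB thresholds are taken from the table's minimum and recommended VRAM rows rather than from any official tool.

```python
def suggest_model(memory_gb: float, prefer_quality: bool = True,
                  unified_memory: bool = False) -> str:
    """Suggest a model and quantization following this page's guidance.

    memory_gb       -- GPU VRAM, or total RAM on unified-memory machines
    prefer_quality  -- favor benchmark scores over speed
    unified_memory  -- True for Apple Silicon style shared memory
    """
    if memory_gb < 16:
        return "none (below the 16 GB minimum discussed on this page)"
    if memory_gb >= 24:
        # Recommended tier: Q4_K_M for either model.
        return "27B Q4_K_M" if prefer_quality else "35B A3B Q4_K_M"
    # 16 GB up to (but not including) 24 GB.
    if unified_memory or not prefer_quality:
        # The page calls the 27B "tight" on a Mac M4 16GB.
        return "35B A3B Q3_K_M"
    return "27B IQ4_XS"


if __name__ == "__main__":
    print(suggest_model(16, unified_memory=True))
    print(suggest_model(16))
    print(suggest_model(24, prefer_quality=False))
```

Treat the output as a starting point: actual fit depends on context length, KV cache settings, and whatever else is using memory.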
Get started
Ready to choose your Qwen 3.6 open-weight model? Try both for free
Try both models in the browser, then download the one that fits your hardware: the 27B for maximum quality (77.2% SWE-bench Verified, ahead of Claude 4.5 Opus on SkillsBench), or the 35B A3B for consumer GPU deployment (20-40 tok/s, Mac M4 16GB confirmed). Both are Apache 2.0 licensed.