Qwen 3.6 Models
Four models, one family - from Mac M4 16GB to frontier performance at $0.40/M tokens
The Qwen 3.6 family spans proprietary hosted models and open-weight releases. Plus delivers 78.8% SWE-bench with 1M context and preserve_thinking at $0.40/$2.40 per million tokens (12x cheaper than Claude Opus 4.6). Max handles advanced reasoning and multi-modal tasks. The 27B dense model achieves 77.2% SWE-bench and 48.2 SkillsBench (beating Claude 4.5 Opus). The 35B A3B MoE runs on Mac M4 16GB at Q3 quantization with 20-40 tok/s. Every model shares the same architecture foundation and OpenAI-compatible API.
Model family
Choose the right Qwen 3.6 model for your use case and budget
From lightweight local deployment on a laptop to maximum hosted performance with 1M context, the Qwen 3.6 family covers every scale, budget, and deployment scenario.
Proprietary
Hosted models with maximum performance and 1M context
Plus and Max are proprietary models available via API. They offer the highest performance, 1M context windows, up to 65,536 output tokens, and features like preserve_thinking that are exclusive to the hosted tier. DashScope pricing starts at $0.40 per million input tokens - roughly 12x cheaper than Claude Opus 4.6. Batch invocation available at 50% of real-time pricing.
Available via OpenAI-compatible API through DashScope and OpenRouter (free tier available)
Open-weight
Run on your own hardware with full control - Apache 2.0 licensed
The 27B dense and 35B A3B MoE models are released under the Apache 2.0 license. Deploy locally with Ollama, vLLM, llama.cpp, SGLang, or KTransformers. The 27B fits 16GB VRAM with IQ4_XS GGUF and KV cache compression. The 35B A3B runs on Mac M4 16GB at Q3 quantization. Zero per-token costs, full data privacy, and freedom to fine-tune.
Available on HuggingFace, Ollama, and GGUF repositories
Qwen 3.6 Plus
Proprietary
Flagship proprietary model with 1M context, preserve_thinking for agentic workflows, and top-tier performance. 78.8% SWE-bench Verified, 61.6 Terminal-Bench 2.0, 56.6 SWE-bench Pro. DashScope pricing: $0.40 input / $2.40 output per million tokens, roughly 12x cheaper than Claude Opus 4.6. Batch invocation at 50% off. Up to 65,536 output tokens per request.
1M context window, preserve_thinking parameter, 65K output tokens, batch at 50% off
Qwen 3.6 Max
Proprietary
High-performance proprietary model optimized for complex reasoning, multi-modal tasks, and document understanding. Strong across math, science, visual analysis, and long-document processing. Extended context window with advanced reasoning capabilities for the most demanding analytical tasks.
Extended context, multi-modal capabilities, advanced reasoning, document understanding
Qwen 3.6 27B
Open-weight
Dense 27B parameter model delivering the best open-weight coding performance. 77.2% SWE-bench Verified, 59.3 Terminal-Bench 2.0, 83.9 LiveCodeBench, 48.2 SkillsBench (beats Claude 4.5 Opus at 45.3), 1487 QwenWebBench, 36.2 NL2Repo, 72.4 Claw-Eval. Can run on 16GB VRAM using IQ4_XS GGUF with KV cache compression supporting 100K context.
55.6GB FP16, 16GB VRAM with IQ4_XS + KV cache compression, dense architecture, Apache 2.0
Qwen 3.6 35B A3B
Open-weight
MoE model with 35B total / 3B active parameters. Near-27B performance in a consumer GPU footprint. 73.4% SWE-bench Verified, 51.5 Terminal-Bench 2.0, 80.4 LiveCodeBench, 68.7 Claw-Eval, 1397 QwenWebBench. Runs on Mac M4 16GB at Q3 quantization (~17GB). 20-40 tok/s on consumer hardware at 4-bit. Vision and multimodal supported.
~21GB Q4_K_M, ~17GB Q3_K_M (Mac M4 16GB), 3B active params, 20-40 tok/s, Apache 2.0
Qwen ecosystem
A unified model family for every deployment scenario and budget
From cloud API at $0.40/M tokens to Mac M4 laptop deployment, the Qwen 3.6 family provides consistent quality, compatible interfaces, and industry-leading price-performance across all deployment targets.
Get started
Ready to explore the Qwen 3.6 family? Try free, deploy anywhere
Try any Qwen 3.6 model for free in the browser or via OpenRouter's free tier. Download open-weight models under Apache 2.0 to run on your own hardware. From Mac M4 16GB to production servers, from $0.40/M tokens API to zero-cost local deployment.