Qwen 3.6 vs DeepSeek V4
Qwen 3.6 Plus leads on agentic benchmarks with proven results as DeepSeek V4 enters the arena
DeepSeek V4, with its ~1T parameter MoE architecture and 1M context window, represents a major new contender in the AI landscape. But Qwen 3.6 Plus already leads with proven benchmarks: 78.8% SWE-bench Verified, 61.6 Terminal-Bench 2.0, and the preserve_thinking parameter for agentic workflows. At $0.40/$2.40 per million tokens (12x cheaper than Claude Opus 4.6), Qwen 3.6 offers unmatched price-performance. DeepSeek V4 Pro scales to 1.6T parameters. Qwen also provides open-weight models (27B at 77.2% SWE-bench, 35B A3B) for local deployment.
Benchmarks
Qwen 3.6 vs DeepSeek V4 - available benchmark data and specifications
Benchmark comparison based on currently available data. Qwen 3.6 Plus leads on agentic coding benchmarks with proven results across SWE-bench, Terminal-Bench, SkillsBench, and tool use evaluations. DeepSeek V4 data will be updated as more results become public.
Qwen 3.6 Plus has established strong benchmark positions across software engineering and agentic coding tasks, with the 27B open-weight model delivering near-Plus performance. As DeepSeek V4 completes its rollout, more comprehensive comparisons will become available. Current data shows Qwen 3.6 leading on the key agentic benchmarks with proven, reproducible results and a mature deployment ecosystem.


Qwen 3.6 Plus: 78.8% SWE-bench Verified, 61.6 Terminal-Bench 2.0
Qwen 3.6 27B: 77.2% SWE-bench, 48.2 SkillsBench (beats Claude 4.5 Opus)
Qwen 3.6 27B: 83.9 LiveCodeBench, 1487 QwenWebBench, 72.4 Claw-Eval
Both models: 1M token context window
Qwen 3.6 Plus: $0.40/$2.40 per M tokens, batch at 50% off
Benchmark table
Qwen 3.6 vs DeepSeek V4 - current results and specifications
Available benchmark data for both model families. DeepSeek V4 results will be updated as more data becomes public. Qwen 3.6 results are from official releases with reproducible evaluations.
| Benchmark | Qwen 3.6 Plus Proprietary Available now | Qwen 3.6 27B Dense open-weight | Qwen 3.6 35B A3B MoE open-weight | DeepSeek V4 ~1T MoE | DeepSeek V4 Pro 1.6T MoE |
|---|---|---|---|---|---|
SWE-bench Verified Real-world software engineering | 78.8% | 77.2% | 73.4% | - | - |
Terminal-Bench 2.0 Terminal operations | 61.6 | 59.3 | 51.5 | - | - |
SkillsBench Practical coding skills | - | 48.2 | - | - | - |
LiveCodeBench Competitive code generation | - | 83.9 | 80.4 | - | - |
Claw-Eval Avg End-to-end agentic coding | - | 72.4 | 68.7 | - | - |
Context window Maximum context length | 1M tokens | 128K tokens | 128K tokens | 1M tokens | 1M tokens |
Architecture Model architecture | Proprietary | 27B Dense | 35B MoE (3B active) | ~1T MoE | 1.6T MoE |
preserve_thinking Agentic reasoning persistence | Yes | No | No | No | No |
Open-weight Local deployment available | No | Yes (Apache 2.0) | Yes (Apache 2.0) | TBD | TBD |
Qwen 3.6 data from official release (March 2026). DeepSeek V4 data from initial launch reports (April 2026). Some DeepSeek V4 benchmarks pending full publication.
Agentic Coding
78.8% SWE-bench Verified - proven results available today
Qwen 3.6 Plus leads on the most important agentic coding benchmarks with reproducible results: 78.8% SWE-bench Verified, 61.6 Terminal-Bench 2.0. The open-weight 27B model reaches 77.2% SWE-bench - rivaling top proprietary models. DeepSeek V4 targets similar capabilities but comprehensive benchmark data is still emerging as the model completes its rollout.
- 78.8% SWE-bench Verified (Plus), 77.2% (27B open-weight)
- 61.6 Terminal-Bench 2.0, 72.4 Claw-Eval Avg (27B)
- Fully available now via API, OpenRouter, and local deployment

Open Weight
Apache 2.0 open-weight models run on consumer hardware today
Qwen 3.6 27B runs on a 16GB GPU (IQ4_XS quantization) with 77.2% SWE-bench. The 35B A3B MoE runs on Mac M4 16GB at Q3, achieving 73.4% SWE-bench and 68.7 Claw-Eval. Both models are Apache 2.0 licensed for commercial use. DeepSeek V4 open-weight availability timeline is still being confirmed.
- 27B dense: 16GB VRAM, 77.2% SWE-bench, Apache 2.0 licensed
- 35B A3B MoE: runs on Mac M4 16GB, 73.4% SWE-bench
- Deploy with Ollama, vLLM, llama.cpp, SGLang, or KTransformers

Try Qwen 3.6
Start using Qwen 3.6 today
Try the free chat, integrate via API, or deploy open-weight models locally.
Compare and deploy
Explore both model families
Compare Qwen and DeepSeek V4, or deploy Qwen open-weight models locally.
Qwen ecosystem
Proven agentic performance, available today, at industry-leading pricing
Qwen 3.6 is fully available with proven benchmarks, open-weight models under Apache 2.0, preserve_thinking for agentic workflows, and pricing at $0.40/$2.40 per million tokens. Don't wait for benchmarks - start building today.
Try Qwen 3.6
Don't wait for benchmarks - experience proven agentic performance today
Qwen 3.6 is fully available with 78.8% SWE-bench, preserve_thinking, and $0.40/$2.40 per million tokens. Chat for free, deploy locally with open-weight models, or integrate via the OpenAI-compatible API. Works with Claude Code, OpenClaw, Aider, and Continue.dev.