Qwen 3.6 vs DeepSeek V4

Qwen 3.6 Plus leads on agentic benchmarks with proven results as DeepSeek V4 enters the arena

DeepSeek V4, with its ~1T parameter MoE architecture and 1M context window, represents a major new contender in the AI landscape. But Qwen 3.6 Plus already leads with proven benchmarks: 78.8% SWE-bench Verified, 61.6 Terminal-Bench 2.0, and the preserve_thinking parameter for agentic workflows. At $0.40/$2.40 per million tokens (12x cheaper than Claude Opus 4.6), Qwen 3.6 offers unmatched price-performance. DeepSeek V4 Pro scales to 1.6T parameters. Qwen also provides open-weight models (27B at 77.2% SWE-bench, 35B A3B) for local deployment.

Benchmarks

Qwen 3.6 vs DeepSeek V4 - available benchmark data and specifications

Benchmark comparison based on currently available data. Qwen 3.6 Plus leads on agentic coding benchmarks with proven results across SWE-bench, Terminal-Bench, SkillsBench, and tool use evaluations. DeepSeek V4 data will be updated as more results become public.

Qwen 3.6 Plus has established strong benchmark positions across software engineering and agentic coding tasks, with the 27B open-weight model delivering near-Plus performance. As DeepSeek V4 completes its rollout, more comprehensive comparisons will become available. Current data shows Qwen 3.6 leading on the key agentic benchmarks with proven, reproducible results and a mature deployment ecosystem.

Benchmark comparison chart showing Qwen 3.6 vs DeepSeek V4 performance on available benchmarks including SWE-bench, Terminal-Bench, and SkillsBench

Qwen 3.6 Plus: 78.8% SWE-bench Verified, 61.6 Terminal-Bench 2.0

Qwen 3.6 27B: 77.2% SWE-bench, 48.2 SkillsBench (beats Claude 4.5 Opus)

Qwen 3.6 27B: 83.9 LiveCodeBench, 1487 QwenWebBench, 72.4 Claw-Eval

Both models: 1M token context window

Qwen 3.6 Plus: $0.40/$2.40 per M tokens, batch at 50% off

Benchmark table

Qwen 3.6 vs DeepSeek V4 - current results and specifications

Available benchmark data for both model families. DeepSeek V4 results will be updated as more data becomes public. Qwen 3.6 results are from official releases with reproducible evaluations.

Benchmark
Qwen 3.6 Plus
Proprietary
Available now
Qwen 3.6 27B
Dense open-weight
Qwen 3.6 35B A3B
MoE open-weight
DeepSeek V4
~1T MoE
DeepSeek V4 Pro
1.6T MoE
SWE-bench Verified
Real-world software engineering
78.8%77.2%73.4%--
Terminal-Bench 2.0
Terminal operations
61.659.351.5--
SkillsBench
Practical coding skills
-48.2---
LiveCodeBench
Competitive code generation
-83.980.4--
Claw-Eval Avg
End-to-end agentic coding
-72.468.7--
Context window
Maximum context length
1M tokens128K tokens128K tokens1M tokens1M tokens
Architecture
Model architecture
Proprietary27B Dense35B MoE (3B active)~1T MoE1.6T MoE
preserve_thinking
Agentic reasoning persistence
YesNoNoNoNo
Open-weight
Local deployment available
NoYes (Apache 2.0)Yes (Apache 2.0)TBDTBD

Qwen 3.6 data from official release (March 2026). DeepSeek V4 data from initial launch reports (April 2026). Some DeepSeek V4 benchmarks pending full publication.

Agentic Coding

78.8% SWE-bench Verified - proven results available today

Qwen 3.6 Plus leads on the most important agentic coding benchmarks with reproducible results: 78.8% SWE-bench Verified, 61.6 Terminal-Bench 2.0. The open-weight 27B model reaches 77.2% SWE-bench - rivaling top proprietary models. DeepSeek V4 targets similar capabilities but comprehensive benchmark data is still emerging as the model completes its rollout.

  • 78.8% SWE-bench Verified (Plus), 77.2% (27B open-weight)
  • 61.6 Terminal-Bench 2.0, 72.4 Claw-Eval Avg (27B)
  • Fully available now via API, OpenRouter, and local deployment
78.8% SWE-bench Verified - proven results available today

Open Weight

Apache 2.0 open-weight models run on consumer hardware today

Qwen 3.6 27B runs on a 16GB GPU (IQ4_XS quantization) with 77.2% SWE-bench. The 35B A3B MoE runs on Mac M4 16GB at Q3, achieving 73.4% SWE-bench and 68.7 Claw-Eval. Both models are Apache 2.0 licensed for commercial use. DeepSeek V4 open-weight availability timeline is still being confirmed.

  • 27B dense: 16GB VRAM, 77.2% SWE-bench, Apache 2.0 licensed
  • 35B A3B MoE: runs on Mac M4 16GB, 73.4% SWE-bench
  • Deploy with Ollama, vLLM, llama.cpp, SGLang, or KTransformers
Apache 2.0 open-weight models run on consumer hardware today

Qwen ecosystem

Proven agentic performance, available today, at industry-leading pricing

Qwen 3.6 is fully available with proven benchmarks, open-weight models under Apache 2.0, preserve_thinking for agentic workflows, and pricing at $0.40/$2.40 per million tokens. Don't wait for benchmarks - start building today.

Qwen 3.6 Plus

78.8% SWE-bench, $0.40/M tokens

Try Plus

Qwen 3.6 27B

77.2% SWE-bench, open-weight, Apache 2.0

Try 27B

Qwen 3.6 35B A3B

73.4% SWE-bench, consumer GPU

Try 35B

API access

OpenAI-compatible, preserve_thinking, free tier

View API

Run locally

Ollama, vLLM, llama.cpp, SGLang

Get started

Community

Join the Qwen developer community

Join

Try Qwen 3.6

Don't wait for benchmarks - experience proven agentic performance today

Qwen 3.6 is fully available with 78.8% SWE-bench, preserve_thinking, and $0.40/$2.40 per million tokens. Chat for free, deploy locally with open-weight models, or integrate via the OpenAI-compatible API. Works with Claude Code, OpenClaw, Aider, and Continue.dev.