Qwen 3.6 vs Kimi K2.6
Two agentic powerhouses - Kimi K2.6 leads Terminal-Bench, Qwen 3.6 leads SWE-bench and offers open-weight flexibility
Kimi K2.6 from Moonshot AI scored 66.7% on Terminal-Bench 2.0 and sustained 4,000+ tool calls over 13 hours, demonstrating exceptional long-running agent endurance. Qwen 3.6 Plus scores 61.6 on Terminal-Bench but leads with 78.8% SWE-bench Verified and the preserve_thinking parameter for maintaining reasoning state. The 27B open-weight model achieves 77.2% SWE-bench and 48.2 SkillsBench (beating Claude 4.5 Opus). Qwen offers open-weight models, local deployment, and API pricing at $0.40/$2.40 per million tokens.
Benchmarks
Qwen 3.6 vs Kimi K2.6 - comprehensive agentic benchmark comparison
Both models represent the state of the art in agentic coding. Kimi K2.6 leads on Terminal-Bench and endurance, while Qwen 3.6 leads on SWE-bench, SkillsBench, and offers broader benchmark coverage with open-weight deployment options.
The agentic AI landscape is evolving rapidly, with both Qwen 3.6 and Kimi K2.6 pushing boundaries in different directions. Kimi K2.6's Terminal-Bench score (66.7%) and endurance testing (4,000+ tool calls over 13 hours) demonstrate exceptional long-running agent capabilities. Qwen 3.6 provides a more complete ecosystem with 78.8% SWE-bench, open-weight models, preserve_thinking, competitive pricing, and integration with popular coding tools.


Terminal-Bench 2.0: Kimi K2.6 66.7% vs Qwen 3.6 Plus 61.6
Kimi K2.6: 4,000+ tool calls sustained over 13 hours
Qwen 3.6 Plus: 78.8% SWE-bench Verified
Qwen 3.6 27B: 77.2% SWE-bench, 48.2 SkillsBench (beats Claude 4.5 Opus)
Qwen 3.6 27B: 83.9 LiveCodeBench, 1487 QwenWebBench, 72.4 Claw-Eval
Benchmark table
Qwen 3.6 vs Kimi K2.6 - detailed results across all evaluations
Available benchmark data for both model families across agentic coding, software engineering, practical skills, and endurance evaluations.
| Benchmark | Qwen 3.6 Plus Proprietary | Qwen 3.6 27B Dense open-weight | Qwen 3.6 35B A3B MoE open-weight | Kimi K2.6 Proprietary Terminal-Bench leader |
|---|---|---|---|---|
Terminal-Bench 2.0 Terminal operations | 61.6 | 59.3 | 51.5 | 66.7 |
SWE-bench Verified Real-world software engineering | 78.8% | 77.2% | 73.4% | - |
SkillsBench Practical coding skills | - | 48.2 | - | - |
LiveCodeBench Competitive code generation | - | 83.9 | 80.4 | - |
QwenWebBench Frontend code generation | - | 1487 | 1397 | - |
Claw-Eval Avg End-to-end agentic coding | - | 72.4 | 68.7 | - |
Max tool calls (single session) Agent endurance | - | - | - | 4,000+ |
Max session duration Sustained operation | - | - | - | 13 hours |
preserve_thinking Reasoning state persistence | Yes | No | No | No |
Open-weight models Local deployment available | No | Yes (Apache 2.0) | Yes (Apache 2.0) | No |
Qwen 3.6 data from official release (March 2026). Kimi K2.6 data from Moonshot AI release (April 20, 2026). SkillsBench reference: Claude 4.5 Opus scores 45.3.
Agentic Coding
Qwen 3.6 leads on agentic coding with proven open-weight models
Qwen 3.6 Plus delivers 78.8% SWE-bench Verified and 61.6 Terminal-Bench 2.0. The open-weight 27B model achieves 77.2% SWE-bench and 48.2 SkillsBench - beating Claude 4.5 Opus. Kimi K2.6 targets similar agentic use cases but Qwen 3.6 provides full transparency with published benchmark results and open-weight models for local verification.
- 78.8% SWE-bench Verified (Plus), 77.2% (27B open-weight)
- 61.6 Terminal-Bench 2.0, 48.2 SkillsBench (27B, beats Claude 4.5 Opus)
- preserve_thinking parameter for agentic workflow state persistence

Price-Performance
$0.40/M tokens with free tier - the most accessible agentic model
Qwen 3.6 Plus via DashScope costs $0.40 input / $2.40 output per million tokens - roughly 12x cheaper than Claude Opus 4.6. OpenRouter free preview tier requires no credit card. Open-weight 27B and 35B A3B models enable zero per-token cost with local deployment. Works with Claude Code, Aider, Continue.dev, and any OpenAI-compatible framework.
- $0.40/$2.40 per M tokens via DashScope (~12x cheaper than Claude Opus 4.6)
- Free tier via OpenRouter, no credit card required
- Zero cost with local deployment via Ollama, vLLM, or llama.cpp

Try Qwen 3.6
Start using Qwen 3.6 today
Try the free chat, integrate via API, or deploy open-weight models locally.
Compare and deploy
Explore both model families
Compare Qwen and Kimi K2.6, or deploy Qwen open-weight models locally.
Qwen ecosystem
Agentic performance with open-weight flexibility and competitive pricing
Qwen 3.6 combines strong agentic benchmarks (78.8% SWE-bench) with open-weight models, preserve_thinking, $0.40/M token pricing, and integration with Claude Code, OpenClaw, Aider, and Continue.dev.
Try Qwen 3.6
Experience Qwen 3.6's agentic capabilities today - free chat, open-weight, competitive pricing
Chat for free, deploy locally with open-weight models under Apache 2.0, or integrate via the OpenAI-compatible API at $0.40/$2.40 per million tokens. preserve_thinking for agentic workflows, works with Claude Code, OpenClaw, Aider, and Continue.dev.