Qwen 3.6 Plus
1M context, agentic mastery - the proprietary model that outperforms Claude 4.5 Opus on terminal tasks
Qwen 3.6 Plus is Alibaba's proprietary hosted model with a 1M token context window. It scores 78.8% on SWE-bench Verified, 61.6 on Terminal-Bench 2.0 (beating Claude 4.5 Opus at 59.3), and introduces preserve_thinking for seamless agent loops. Available via OpenAI-compatible API since March 31, 2026.
Capabilities
Purpose-built for agentic workflows and long-context tasks
Qwen 3.6 Plus combines a 1M token context window with the preserve_thinking parameter and top-tier coding benchmarks to deliver a model optimized for complex, multi-step agent pipelines.
Elite software engineering
78.8% on SWE-bench Verified and 56.6 on SWE-bench Pro. Resolves real-world GitHub issues with deep codebase understanding across the full 1M context window.
Terminal mastery
61.6 on Terminal-Bench 2.0 - surpassing Claude 4.5 Opus (59.3). Handles complex multi-step terminal workflows, debugging sessions, and system administration with expert proficiency.
Agentic tool use
57.2 on QwenClawBench and 48.2 on MCPMark for tool orchestration. The preserve_thinking parameter maintains reasoning state across agent loop iterations without token waste.
1M token context
Process entire codebases, long research papers, and extended multi-turn conversations. 70.7 on TAU3-Bench demonstrates strong long-context task completion.
Document understanding
91.2 on OmniDocBench1.5 and 94.4 on AI2D_TEST. Excels at parsing complex documents, diagrams, and visual information with high accuracy.
OpenAI-compatible API
Drop-in replacement for existing OpenAI API integrations. The preserve_thinking parameter extends the standard API for agentic use cases without breaking compatibility.
Key highlights
Agentic performance that leads the field
Qwen 3.6 Plus achieves top-tier results across software engineering, terminal operations, tool use, and document understanding benchmarks.
Top achievements
- SWE-bench Verified: 78.8% - real-world software engineering
- Terminal-Bench 2.0: 61.6 - beats Claude 4.5 Opus (59.3)
- SWE-bench Pro: 56.6 - advanced software engineering
- QwenClawBench: 57.2 - agentic tool orchestration
- MCPMark: 48.2 - MCP protocol tool use
Technical specs
- Proprietary hosted model by Alibaba Cloud
- 1M token context window
- preserve_thinking parameter for agent loops
- OpenAI-compatible API
- Released March 31, 2026
Performance
Agentic dominance with 1M context and preserve_thinking
Qwen 3.6 Plus scores 78.8% on SWE-bench Verified and 61.6 on Terminal-Bench 2.0, establishing a new standard for proprietary agentic models with its 1M token context and preserve_thinking capability.
Qwen 3.6 Plus demonstrates consistent leadership across software engineering, terminal operations, agentic tool use, and document understanding - purpose-built for complex multi-step workflows that demand long-context reasoning.


SWE-bench Verified: 78.8% - real-world software engineering
Terminal-Bench 2.0: 61.6 - beats Claude 4.5 Opus (59.3)
SWE-bench Pro: 56.6 - advanced software engineering
QwenClawBench: 57.2 - agentic tool orchestration
OmniDocBench1.5: 91.2 - document understanding
Benchmark comparison
Qwen 3.6 Plus vs frontier proprietary models
Qwen 3.6 Plus leads on agentic and software engineering benchmarks, with the preserve_thinking parameter enabling seamless multi-step agent workflows.
| Benchmark | Qwen 3.6 Plus Proprietary Featured | Qwen 3.6 27B Dense | Claude 4.5 Opus Proprietary | Qwen 3.6 Max Proprietary |
|---|---|---|---|---|
SWE-bench Verified Real-world software engineering | 78.8% | 77.2% | - | - |
Terminal-Bench 2.0 Terminal operations | 61.6 | 59.3 | 59.3 | - |
SWE-bench Pro Advanced software engineering | 56.6 | - | - | - |
QwenClawBench Agentic tool orchestration | 57.2 | - | - | - |
TAU3-Bench Long-context task completion | 70.7 | - | - | - |
MCPMark MCP protocol tool use | 48.2 | - | - | - |
OmniDocBench1.5 Document understanding | 91.2 | - | - | - |
AI2D_TEST Diagram understanding | 94.4 | - | - | - |
Benchmark results from official Qwen 3.6 release. Released March 31, 2026.
preserve_thinking
Maintain reasoning state across agent loop iterations
The preserve_thinking parameter is a first-of-its-kind API feature that lets agent frameworks maintain the model's internal reasoning state across multiple tool-call iterations. Instead of discarding chain-of-thought tokens between steps, preserve_thinking keeps them active, reducing redundant re-reasoning and improving multi-step task accuracy.
- Maintains reasoning context across agent loop iterations
- Reduces redundant re-reasoning in multi-step workflows
- OpenAI-compatible API with preserve_thinking extension

1M Context
Process entire codebases and long documents in a single pass
Qwen 3.6 Plus supports a 1M token context window, enabling analysis of entire repositories, long research papers, and extended multi-turn conversations. Combined with 70.7 on TAU3-Bench and 91.2 on OmniDocBench1.5, it excels at tasks that demand deep long-context understanding.
- 1M token context window for entire codebases
- 70.7 on TAU3-Bench long-context task completion
- 91.2 on OmniDocBench1.5 document understanding
Get started
Try Qwen 3.6 Plus now
Start chatting instantly or integrate via the OpenAI-compatible API.
Integration guides
Build with Qwen 3.6 Plus
Integrate Qwen 3.6 Plus into your applications with OpenAI-compatible SDKs and agent frameworks.
Qwen ecosystem
Part of the Qwen 3.6 model family
Qwen 3.6 Plus is the proprietary agentic variant in Alibaba's latest model family, optimized for long-context workflows and multi-step tool use.
Get started
Ready to build with Qwen 3.6 Plus?
Start chatting instantly for free, or integrate via the OpenAI-compatible API with preserve_thinking for agentic workflows.