Qwen 3.6 Plus

1M context, agentic mastery - the proprietary model that outperforms Claude 4.5 Opus on terminal tasks

Qwen 3.6 Plus is Alibaba's proprietary hosted model with a 1M token context window. It scores 78.8% on SWE-bench Verified, 61.6 on Terminal-Bench 2.0 (beating Claude 4.5 Opus at 59.3), and introduces preserve_thinking for seamless agent loops. Available via OpenAI-compatible API since March 31, 2026.

Capabilities

Purpose-built for agentic workflows and long-context tasks

Qwen 3.6 Plus combines a 1M token context window with the preserve_thinking parameter and top-tier coding benchmarks to deliver a model optimized for complex, multi-step agent pipelines.

Elite software engineering

78.8% on SWE-bench Verified and 56.6 on SWE-bench Pro. Resolves real-world GitHub issues with deep codebase understanding across the full 1M context window.

Terminal mastery

61.6 on Terminal-Bench 2.0 - surpassing Claude 4.5 Opus (59.3). Handles complex multi-step terminal workflows, debugging sessions, and system administration with expert proficiency.

Agentic tool use

57.2 on QwenClawBench and 48.2 on MCPMark for tool orchestration. The preserve_thinking parameter maintains reasoning state across agent loop iterations without token waste.

1M token context

Process entire codebases, long research papers, and extended multi-turn conversations. 70.7 on TAU3-Bench demonstrates strong long-context task completion.

Document understanding

91.2 on OmniDocBench1.5 and 94.4 on AI2D_TEST. Excels at parsing complex documents, diagrams, and visual information with high accuracy.

OpenAI-compatible API

Drop-in replacement for existing OpenAI API integrations. The preserve_thinking parameter extends the standard API for agentic use cases without breaking compatibility.

Key highlights

Agentic performance that leads the field

Qwen 3.6 Plus achieves top-tier results across software engineering, terminal operations, tool use, and document understanding benchmarks.

Top achievements

  • SWE-bench Verified: 78.8% - real-world software engineering
  • Terminal-Bench 2.0: 61.6 - beats Claude 4.5 Opus (59.3)
  • SWE-bench Pro: 56.6 - advanced software engineering
  • QwenClawBench: 57.2 - agentic tool orchestration
  • MCPMark: 48.2 - MCP protocol tool use

Technical specs

  • Proprietary hosted model by Alibaba Cloud
  • 1M token context window
  • preserve_thinking parameter for agent loops
  • OpenAI-compatible API
  • Released March 31, 2026

Performance

Agentic dominance with 1M context and preserve_thinking

Qwen 3.6 Plus scores 78.8% on SWE-bench Verified and 61.6 on Terminal-Bench 2.0, establishing a new standard for proprietary agentic models with its 1M token context and preserve_thinking capability.

Qwen 3.6 Plus demonstrates consistent leadership across software engineering, terminal operations, agentic tool use, and document understanding - purpose-built for complex multi-step workflows that demand long-context reasoning.

Qwen 3.6 Plus performance comparison chart across coding, agentic, and document understanding benchmarks

SWE-bench Verified: 78.8% - real-world software engineering

Terminal-Bench 2.0: 61.6 - beats Claude 4.5 Opus (59.3)

SWE-bench Pro: 56.6 - advanced software engineering

QwenClawBench: 57.2 - agentic tool orchestration

OmniDocBench1.5: 91.2 - document understanding

Benchmark comparison

Qwen 3.6 Plus vs frontier proprietary models

Qwen 3.6 Plus leads on agentic and software engineering benchmarks, with the preserve_thinking parameter enabling seamless multi-step agent workflows.

Benchmark
Qwen 3.6 Plus
Proprietary
Featured
Qwen 3.6 27B
Dense
Claude 4.5 Opus
Proprietary
Qwen 3.6 Max
Proprietary
SWE-bench Verified
Real-world software engineering
78.8%77.2%--
Terminal-Bench 2.0
Terminal operations
61.659.359.3-
SWE-bench Pro
Advanced software engineering
56.6---
QwenClawBench
Agentic tool orchestration
57.2---
TAU3-Bench
Long-context task completion
70.7---
MCPMark
MCP protocol tool use
48.2---
OmniDocBench1.5
Document understanding
91.2---
AI2D_TEST
Diagram understanding
94.4---

Benchmark results from official Qwen 3.6 release. Released March 31, 2026.

preserve_thinking

Maintain reasoning state across agent loop iterations

The preserve_thinking parameter is a first-of-its-kind API feature that lets agent frameworks maintain the model's internal reasoning state across multiple tool-call iterations. Instead of discarding chain-of-thought tokens between steps, preserve_thinking keeps them active, reducing redundant re-reasoning and improving multi-step task accuracy.

  • Maintains reasoning context across agent loop iterations
  • Reduces redundant re-reasoning in multi-step workflows
  • OpenAI-compatible API with preserve_thinking extension
Maintain reasoning state across agent loop iterations

1M Context

Process entire codebases and long documents in a single pass

Qwen 3.6 Plus supports a 1M token context window, enabling analysis of entire repositories, long research papers, and extended multi-turn conversations. Combined with 70.7 on TAU3-Bench and 91.2 on OmniDocBench1.5, it excels at tasks that demand deep long-context understanding.

  • 1M token context window for entire codebases
  • 70.7 on TAU3-Bench long-context task completion
  • 91.2 on OmniDocBench1.5 document understanding

Qwen ecosystem

Part of the Qwen 3.6 model family

Qwen 3.6 Plus is the proprietary agentic variant in Alibaba's latest model family, optimized for long-context workflows and multi-step tool use.

Documentation

Complete guides for API integration and agent workflows

Read docs

API Reference

OpenAI-compatible endpoints with preserve_thinking

View API

Model Card

Technical specifications and evaluation results

View details

Pricing

Usage-based pricing for API access

View pricing

Agent Frameworks

Integration guides for LangChain, AutoGen, and more

Get started

Community

Join the Qwen developer community

Join

Get started

Ready to build with Qwen 3.6 Plus?

Start chatting instantly for free, or integrate via the OpenAI-compatible API with preserve_thinking for agentic workflows.