Qwen Code

Agentic coding that resolves real GitHub issues, masters terminal workflows, and beats Claude on SkillsBench

The Qwen 3.6 family delivers elite coding performance across every dimension. The Plus model scores 78.8% on SWE-bench Verified and 61.6 on Terminal-Bench 2.0. The 27B dense model achieves 77.2% SWE-bench, 48.2 on SkillsBench (beating Claude 4.5 Opus at 45.3), and 1487 on QwenWebBench for frontend code generation. The 35B A3B MoE brings 73.4% SWE-bench in a consumer GPU footprint. All models work with Claude Code, OpenClaw, Aider, and Continue.dev via the OpenAI-compatible API. preserve_thinking maintains reasoning state across agent loop iterations for iterative development.

Start Coding View benchmarks

Coding capabilities

Full-stack coding from terminal to production - with thinking preservation

Qwen 3.6 models excel at every stage of the software development lifecycle. From understanding large codebases and generating code to debugging, testing, and deploying through terminal workflows. The preserve_thinking parameter maintains reasoning context across iterative development cycles.

Agentic coding (SWE-bench)

Autonomously resolves real-world GitHub issues end-to-end. 78.8% on SWE-bench Verified (Plus) and 77.2% (27B) demonstrate the ability to navigate repositories, identify root causes, implement fixes, and submit working patches without human intervention. The 35B A3B achieves 73.4% in a consumer GPU footprint. These scores place Qwen 3.6 among the top models for autonomous software engineering.

Frontend code generation (QwenWebBench)

The 27B model scores 1487 on QwenWebBench and the 35B A3B scores 1397, demonstrating strong frontend code generation capabilities. Generates complete React, Vue, and Next.js components with proper TypeScript typing, accessibility attributes, responsive layouts, and design system integration. Handles CSS-in-JS, Tailwind CSS, and component library patterns. The preserve_thinking parameter helps maintain design context across multi-file frontend scaffolding.

Terminal operations (Terminal-Bench)

61.6 on Terminal-Bench 2.0 (Plus) and 59.3 (27B) - expert-level terminal mastery. Handles complex multi-step shell workflows, system administration tasks, debugging sessions, CI/CD pipeline management, Docker orchestration, and infrastructure automation. The 35B A3B scores 51.5, still strong for a consumer GPU model.

SkillsBench - beats Claude 4.5 Opus

The 27B model scores 48.2 on SkillsBench, beating Claude 4.5 Opus at 45.3. SkillsBench evaluates practical coding skills including code review, refactoring, API design, testing strategy, and architectural decision-making. This benchmark measures the kind of nuanced engineering judgment that matters in real-world development, not just code generation.

Repository-level reasoning (NL2Repo)

The 27B model scores 36.2 on NL2Repo, demonstrating the ability to translate natural language descriptions into complete repository structures. Understands cross-file dependencies, module boundaries, architectural patterns, and project conventions across entire repositories. The 1M context window (Plus) enables processing complete codebases in a single pass for comprehensive understanding.

Code generation (LiveCodeBench)

83.9 on LiveCodeBench (27B) and 80.4 (35B A3B) for competitive-grade code generation. Produces clean, idiomatic code across Python, TypeScript, Rust, Go, Java, C++, and 20+ languages with proper error handling, documentation, and test coverage. Handles algorithmic problems, data structure implementations, and system design challenges.

Coding tool integration

Works with Claude Code, OpenClaw, Aider, Continue.dev, and Qwen Code via the OpenAI-compatible API. Set the base URL to your DashScope, OpenRouter, or local Ollama endpoint and start coding immediately. The preserve_thinking parameter is especially valuable in Claude Code and OpenClaw agent loops where maintaining reasoning state across iterations reduces redundant re-reasoning and improves fix accuracy.

Debugging, testing, and Claw-Eval

The 27B model scores 72.4 on Claw-Eval average and the 35B A3B scores 68.7, measuring end-to-end agentic coding capability. Traces bugs through complex call stacks, identifies root causes from error logs, and generates comprehensive test suites. Supports unit tests, integration tests, end-to-end testing frameworks, and property-based testing across all major languages and frameworks.

Coding benchmarks

Top-tier results across every coding evaluation

Qwen 3.6 models consistently rank among the best on software engineering, code generation, terminal operations, and practical coding skill benchmarks.

Software engineering benchmarks

SWE-bench Verified: 78.8% (Plus) / 77.2% (27B) / 73.4% (35B A3B)
Terminal-Bench 2.0: 61.6 (Plus) / 59.3 (27B) / 51.5 (35B A3B)
SkillsBench: 48.2 (27B) - beats Claude 4.5 Opus (45.3)
Claw-Eval Avg: 72.4 (27B) / 68.7 (35B A3B)
LiveCodeBench: 83.9 (27B) / 80.4 (35B A3B)
QwenWebBench: 1487 (27B) / 1397 (35B A3B) - frontend generation
NL2Repo: 36.2 (27B) - natural language to repository
SWE-bench Pro: 56.6 (Plus)

Tool and model options

Works with: Claude Code, OpenClaw, Aider, Continue.dev, Qwen Code
27B Dense: Best open-weight coding, 77.2% SWE-bench
35B A3B MoE: 73.4% SWE-bench on consumer GPU (~21GB VRAM)
Plus: 78.8% SWE-bench, 1M context, preserve_thinking
Frontend: React, Vue, Next.js with TypeScript support
preserve_thinking: maintains reasoning across agent iterations

Start Coding Compare models

Get started

Start coding with Qwen 3.6 - multiple paths available

Choose the right model and tool for your coding workflow. From browser chat to local deployment to API integration.

Chat with Qwen Code

Start coding instantly with the best Qwen 3.6 model for your task

Run locally with Ollama

One-command setup: ollama run qwen3.6:35b-a3b for local coding

API access

OpenAI-compatible API for IDE integrations and CI/CD pipelines

Model comparison

Compare 27B, 35B A3B, and Plus for your coding use case

Claude Code setup

Use Qwen 3.6 as a backend for Claude Code via API

Continue.dev integration

AI coding assistant in VS Code with local or API Qwen 3.6

Integration guides

Integrate Qwen Code into your development workflow

Connect Qwen 3.6 to your favorite development tools, IDEs, and CI/CD pipelines for seamless AI-assisted coding.

Claude Code + Qwen

Configure Claude Code to use Qwen 3.6 via OpenAI-compatible API

OpenClaw setup

Agentic coding with OpenClaw and Qwen 3.6 backend

Aider integration

AI pair programming with local or API Qwen 3.6

Terminal workflows

AI-powered terminal operations and command generation

Git workflows

Automated PR reviews, commit messages, and code analysis

Frontend scaffolding

React, Vue, Next.js project generation with TypeScript

Qwen ecosystem

Coding models for every scale - from consumer GPU to frontier performance

From the 35B A3B that runs on a single consumer GPU to the Plus with 1M context and preserve_thinking, the Qwen 3.6 family covers every coding deployment scenario. All models work with Claude Code, OpenClaw, Aider, and Continue.dev.

Explore all models Official documentation

Qwen 3.6 27B

Dense, 77.2% SWE-bench, 48.2 SkillsBench

Learn more

Qwen 3.6 35B A3B

MoE, 73.4% SWE-bench, consumer GPU

Learn more

Qwen 3.6 Plus

78.8% SWE-bench, 1M context, preserve_thinking

Learn more

Ollama setup

Run Qwen Code locally in one command

Get started

API reference

OpenAI-compatible endpoints for coding tasks

View API

Community

Join the Qwen developer community

Join

Start coding

Ready to code with Qwen 3.6? 78.8% SWE-bench, works with your favorite tools

Start chatting for free or integrate via the OpenAI-compatible API. Works with Claude Code, OpenClaw, Aider, and Continue.dev. Choose from open-weight models you can run locally or the Plus for maximum performance with preserve_thinking.

Start Coding Compare models