Qwen 3.6 35B A3B
35 billion parameters, 3 billion active - frontier MoE on consumer hardware
Qwen 3.6 35B A3B is a Mixture-of-Experts model that activates only 3B parameters per token from 256 experts. With 73.4% on SWE-bench Verified, 92.7% on AIME 2026, and Apache 2.0 licensing, it brings frontier-class coding and reasoning to consumer GPUs.
Model variants
Open-weight MoE for local and cloud deployment
Qwen 3.6 35B A3B delivers strong performance with minimal active parameters. Choose the instruction-tuned variant for chat and coding, or the base model for fine-tuning.
Mixture-of-Experts Architecture
35B total parameters, 3B active per token, 256 experts
Qwen 3.6 35B A3B uses a Hybrid Gated DeltaNet + Gated Attention + MoE design with 256 experts, routing 8 experts plus 1 shared expert per token. The 262K native context is extensible to 1M tokens, and the Apache 2.0 license enables unrestricted commercial use.
With only 3B active parameters per token, this model runs efficiently on consumer GPUs while delivering performance that rivals much larger dense models.
Instruction-tuned
35B A3B Instruct
Optimized for conversational AI, coding, and agentic tasks on consumer hardware
Fine-tuned for instruction following and multi-turn dialogue with MoE efficiency
Pre-trained
35B A3B Base
Foundation MoE model for fine-tuning and specialized applications
Pre-trained with 256-expert MoE routing on diverse data
Capabilities
256 experts, 3B active - maximum efficiency meets strong performance
Qwen 3.6 35B A3B combines a massive expert pool with minimal active compute to deliver impressive coding, reasoning, and agentic capabilities on consumer-grade hardware.
Real-world software engineering
73.4% on SWE-bench Verified - resolving real GitHub issues with only 3B active parameters per token. Competitive with models that use 10x more compute at inference time.
Terminal operations
51.5 on Terminal-Bench 2.0 for complex multi-step terminal workflows. Handles debugging, system administration, and build pipeline tasks with strong proficiency.
Advanced mathematics
92.7% on AIME 2026 - near-frontier math reasoning from a model that runs on consumer GPUs. Step-by-step thinking mode enables transparent problem solving.
262K to 1M context
262K native context window extensible to 1M tokens. Analyze entire codebases, long documents, and complex multi-turn conversations without truncation.
Competitive coding
80.4 on LiveCodeBench v6 for algorithmic problem solving. Strong code generation, debugging, and refactoring capabilities across multiple programming languages.
Open-weight freedom
Apache 2.0 license enables unrestricted commercial use, fine-tuning, and redistribution. Full transparency into model weights for research and customization.
Key highlights
Frontier MoE performance on consumer hardware
Qwen 3.6 35B A3B achieves strong results across coding, reasoning, and agentic benchmarks while activating only 3B parameters per token.
Top achievements
- SWE-bench Verified: 73.4% - real-world software engineering
- Terminal-Bench 2.0: 51.5 - complex terminal operations
- AIME 2026: 92.7% - advanced mathematics
- LiveCodeBench v6: 80.4 - competitive coding
- Apache 2.0 license - fully open-weight
Technical specs
- 35B total parameters, 3B active per token
- 256 experts: 8 routed + 1 shared active per token
- Hybrid Gated DeltaNet + Gated Attention + MoE architecture
- 262K native context, extensible to 1M tokens
- Runs locally on consumer GPUs
Performance
Strong MoE performance at 3B active inference cost
Qwen 3.6 35B A3B scores 73.4% on SWE-bench Verified and 92.7% on AIME 2026 while activating only 3B parameters per token - bringing frontier-class capabilities to consumer hardware.
Qwen 3.6 35B A3B demonstrates that sparse MoE architectures with 256 experts can deliver impressive results across software engineering, mathematics, and competitive coding at a fraction of the compute cost.


SWE-bench Verified: 73.4% with only 3B active parameters
Terminal-Bench 2.0: 51.5 for terminal operations
AIME 2026: 92.7% on advanced mathematics
LiveCodeBench v6: 80.4 competitive coding
Apache 2.0 open-weight license
Benchmark comparison
Qwen 3.6 35B A3B vs the Qwen 3.6 family and competitors
Qwen 3.6 35B A3B delivers strong performance across software engineering, terminal operations, and reasoning benchmarks at minimal inference cost.
| Benchmark | Qwen 3.6 35B A3B MoE Featured | Qwen 3.6 27B Dense | Qwen 3.6 Plus Proprietary | Qwen 3 235B A22B MoE |
|---|---|---|---|---|
SWE-bench Verified Real-world software engineering | 73.4% | 77.2% | 78.8% | 76.2% |
Terminal-Bench 2.0 Terminal operations | 51.5 | 59.3 | 61.6 | - |
AIME 2026 Mathematics No tools | 92.7% | 94.1% | - | - |
LiveCodeBench v6 Competitive coding | 80.4 | 83.9 | - | - |
Benchmark results from official Qwen 3.6 model card and HuggingFace evaluations.
256-Expert MoE
35B capacity, 3B inference cost - runs on consumer GPUs
The Mixture-of-Experts design routes each token through 8 of 256 experts plus 1 shared expert. All 35B parameters load for routing diversity, but only 3B activate per forward pass. Combined with the Hybrid Gated DeltaNet + Gated Attention architecture, this enables consumer-GPU deployment with strong performance.
- 3B active parameters per token from 35B total capacity
- 256 experts: 8 routed + 1 shared active per token
- Runs locally on consumer GPUs with quantization

Open Weight
Apache 2.0 - fully open for commercial use and fine-tuning
Qwen 3.6 35B A3B is released under the Apache 2.0 license, enabling unrestricted commercial deployment, fine-tuning, and redistribution. Download weights from HuggingFace and deploy on your own infrastructure with full control.
- Apache 2.0 license - no usage restrictions
- Full weight access for fine-tuning and customization
- Community-driven ecosystem with broad framework support
Get started
Try Qwen 3.6 35B A3B now
Start chatting instantly, or download open-weight models for self-hosted deployment.
Local deployment
Run on your own hardware
Deploy locally on consumer GPUs with quantized weights. Apache 2.0 license for unrestricted use.
Qwen ecosystem
Part of the Qwen 3.6 model family
Qwen 3.6 35B A3B is the open-weight MoE variant in Alibaba's latest model family, designed for maximum accessibility on consumer hardware.
Get started
Ready to build with Qwen 3.6 35B A3B?
Start chatting instantly for free, or download open-weight models under Apache 2.0 for self-hosted deployment on consumer hardware.