Kimi K2.6

LLM
Reasoning
Vision Language

Kimi K2.6 is an open-source, native multimodal agentic MoE model from Moonshot AI with 1T total parameters (32B activated), advancing long-horizon coding, coding-driven design, and swarm-based task orchestration.

Details

Modalities

text, vision

Version

2.6

Recommended Hardware

8xH200

Provider

Moonshot AI

Family

Kimi K2

Parameters

1T (1000B)

Context

256K tokens

License

MIT (Modified)

Kimi K2.6

Kimi K2.6 is an open-source, native multimodal agentic model from Moonshot AI that advances practical capabilities in long-horizon coding, coding-driven design, proactive autonomous execution, and swarm-based task orchestration. It is a Mixture-of-Experts model with 1 trillion total parameters and 32 billion activated per token, built on the Kimi K2.5 architecture.

Key Features

  • Long-Horizon Coding — Significant improvements on complex, end-to-end coding tasks, generalizing robustly across programming languages (Rust, Go, Python) and domains spanning front-end, DevOps, and performance optimization.
  • Coding-Driven Design — Transforms simple prompts and visual inputs into production-ready interfaces and lightweight full-stack workflows, generating structured layouts, interactive elements, and rich animations with deliberate aesthetic precision.
  • Elevated Agent Swarm — Scales horizontally to 300 sub-agents executing 4,000 coordinated steps; dynamically decomposes tasks into parallel, domain-specialized subtasks, delivering end-to-end outputs from documents to websites to spreadsheets in a single autonomous run.
  • Proactive & Open Orchestration — Demonstrates strong performance in powering persistent 24/7 background agents that proactively manage schedules, execute code, and orchestrate cross-platform operations without human oversight.
  • Thinking & Instant Modes — Supports reasoning (thinking) mode by default and an instant-response mode; preserve_thinking retains full reasoning content across multi-turn interactions for coding-agent scenarios.
  • Multimodal Input — Accepts text, image, and video input via the MoonViT vision encoder (400M parameters).
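As a sketch of how the thinking/instant switch and `preserve_thinking` might be expressed in a request payload for an OpenAI-compatible endpoint: the model identifier and the `thinking` flag name below are illustrative assumptions (only `preserve_thinking` is named above), so check your deployment's API reference before use.

```python
# Sketch: assembling a chat payload that selects thinking vs. instant
# mode. Field names other than "preserve_thinking" are assumptions.

def build_request(messages, thinking=True, preserve_thinking=False):
    """Build a chat-completion payload for Kimi K2.6.

    thinking=True targets the default reasoning mode; False requests
    the instant-response mode. preserve_thinking keeps full reasoning
    content across turns, as suggested for coding-agent scenarios.
    """
    return {
        "model": "kimi-k2.6",                   # hypothetical identifier
        "messages": messages,
        "thinking": thinking,                   # assumed flag name
        "preserve_thinking": preserve_thinking,
    }

req = build_request(
    [{"role": "user", "content": "Refactor this function for clarity."}],
    preserve_thinking=True,
)
```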

Model Summary

| | |
|:---|:---|
| Architecture | Mixture-of-Experts (MoE) |
| Total Parameters | 1T |
| Activated Parameters | 32B |
| Number of Layers | 61 (1 dense + 60 MoE) |
| Number of Experts | 384 (8 selected per token, 1 shared) |
| Attention Hidden Dimension | 7168 |
| MoE Hidden Dimension per Expert | 2048 |
| Number of Attention Heads | 64 |
| Vocabulary Size | 160K |
| Context Length | 256K |
| Attention Mechanism | MLA |
| Activation Function | SwiGLU |
| Vision Encoder | MoonViT (400M parameters) |
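The activated-parameter figure follows from the routing numbers in the summary: each token is processed by 8 routed experts plus 1 shared expert out of 384. A back-of-the-envelope check (which ignores attention, embedding, and dense-layer parameters, so it is an approximation, not an exact accounting):

```python
# Rough sanity check of the MoE activation ratio from the summary.
total_experts = 384
active_experts = 8 + 1          # 8 routed + 1 shared per token
expert_fraction = active_experts / total_experts

print(f"{expert_fraction:.3%} of experts active per token")
# Overall, ~32B of ~1T parameters are activated:
print(f"{32 / 1000:.1%} of total parameters active per token")
```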

Kimi K2.6 ships with native INT4 quantization, using the same method as Kimi K2 Thinking.

Benchmarks

Agentic

  • HLE-Full (with tools): 54.0
  • BrowseComp: 83.2 (86.3 with Agent Swarm)
  • DeepSearchQA (f1-score): 92.5
  • DeepSearchQA (accuracy): 83.0
  • WideSearch (item-f1): 80.8
  • Toolathlon: 50.0
  • MCPMark: 55.9
  • Claw Eval (pass^3): 62.3; (pass@3): 80.9
  • APEX-Agents: 27.9
  • OSWorld-Verified: 73.1

Coding

  • Terminal-Bench 2.0 (Terminus-2): 66.7
  • SWE-Bench Pro: 58.6
  • SWE-Bench Multilingual: 76.7
  • SWE-Bench Verified: 80.2
  • SciCode: 52.2
  • OJBench (python): 60.6
  • LiveCodeBench (v6): 89.6

Reasoning & Knowledge

  • HLE-Full: 34.7
  • AIME 2026: 96.4
  • HMMT 2026 (Feb): 92.7
  • IMO-AnswerBench: 86.0
  • GPQA-Diamond: 90.5

Vision

  • MMMU-Pro: 79.4 (80.1 with python)
  • CharXiv (RQ): 80.4 (86.7 with python)
  • MathVision: 87.4 (93.2 with python)
  • BabyVision: 39.8 (68.5 with python)
  • V* (with python): 96.9

Use Cases

  • Autonomous agentic workflows spanning coding, research, and browsing
  • Long-horizon software engineering and multi-step code generation
  • Coding-driven UI/UX design from prompts and visual inputs
  • Document, chart, and image understanding at scale
  • Multi-agent task orchestration with parallel sub-agent coordination
  • Persistent background agents for schedule management and cross-platform operations
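For the document, chart, and image understanding use cases, images are typically passed as structured content parts inside a chat message. The `image_url` part convention below follows common OpenAI-compatible APIs and is an assumption about this deployment, not a confirmed schema:

```python
# Sketch of a multimodal chat message mixing text and an image.
# The "image_url" content-part shape is assumed from common
# OpenAI-compatible APIs; verify against your deployment's docs.

def vision_message(question: str, image_url: str) -> dict:
    """Build a user message pairing a text question with an image."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": question},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }

msg = vision_message(
    "What trend does this chart show?",
    "https://example.com/chart.png",
)
```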

Quick Start Guide

1. Choose a model and click 'Deploy' above to find available GPUs recommended for this model.
2. Rent your dedicated instance, preconfigured with the model you've selected.
3. Start sending requests to your model instance and get responses right away.
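That last step can be sketched with Python's standard library. The `/chat/completions` path, bearer-token header, and model identifier follow the common OpenAI-compatible convention and are assumptions to verify against your instance's documentation:

```python
import json
import urllib.request

def build_chat_request(base_url: str, api_key: str, prompt: str):
    """Build (but do not send) an OpenAI-style chat request."""
    body = json.dumps({
        "model": "kimi-k2.6",   # placeholder; check your deployment
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )

chat_req = build_chat_request("https://YOUR-INSTANCE/v1", "YOUR_KEY", "Hello")
# Sending it: urllib.request.urlopen(chat_req) returns the JSON completion.
```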