Kimi K2.6

LLM
Reasoning
Vision Language

Kimi K2.6 is an open-source, native multimodal agentic MoE model from Moonshot AI with 1T total parameters (32B activated), advancing long-horizon coding, coding-driven design, and swarm-based task orchestration.

Details

Modalities

text, vision

Version

2.6

Recommended Hardware

8xH200

Provider

Moonshot AI

Family

Kimi K2

Parameters

1T (1000B)

Context

256K tokens

License

MIT (Modified)

Kimi K2.6

Kimi K2.6 is an open-source, native multimodal agentic model from Moonshot AI that advances practical capabilities in long-horizon coding, coding-driven design, proactive autonomous execution, and swarm-based task orchestration. It is a Mixture-of-Experts model with 1 trillion total parameters and 32 billion activated per token, built on the Kimi K2.5 architecture.

Key Features

  • Long-Horizon Coding — Significant improvements on complex, end-to-end coding tasks, generalizing robustly across programming languages (Rust, Go, Python) and domains spanning front-end, DevOps, and performance optimization.
  • Coding-Driven Design — Transforms simple prompts and visual inputs into production-ready interfaces and lightweight full-stack workflows, generating structured layouts, interactive elements, and rich animations with deliberate aesthetic precision.
  • Elevated Agent Swarm — Scales horizontally to 300 sub-agents executing 4,000 coordinated steps; dynamically decomposes tasks into parallel, domain-specialized subtasks, delivering end-to-end outputs from documents to websites to spreadsheets in a single autonomous run.
  • Proactive & Open Orchestration — Demonstrates strong performance in powering persistent 24/7 background agents that proactively manage schedules, execute code, and orchestrate cross-platform operations without human oversight.
  • Thinking & Instant Modes — Supports reasoning (thinking) mode by default and an instant-response mode; preserve_thinking retains full reasoning content across multi-turn interactions for coding-agent scenarios.
  • Multimodal Input — Accepts text, image, and video input via the MoonViT vision encoder (400M parameters).
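As a sketch of how the thinking/instant switch and `preserve_thinking` might be expressed in a request payload for an OpenAI-compatible endpoint: the model identifier and the `thinking` flag name below are illustrative assumptions (only `preserve_thinking` is named above), so check your deployment's API reference before use.

```python
# Sketch: assembling a chat payload that selects thinking vs. instant
# mode. Field names other than "preserve_thinking" are assumptions.

def build_request(messages, thinking=True, preserve_thinking=False):
    """Build a chat-completion payload for Kimi K2.6.

    thinking=True targets the default reasoning mode; False requests
    the instant-response mode. preserve_thinking keeps full reasoning
    content across turns, as suggested for coding-agent scenarios.
    """
    return {
        "model": "kimi-k2.6",                   # hypothetical identifier
        "messages": messages,
        "thinking": thinking,                   # assumed flag name
        "preserve_thinking": preserve_thinking,
    }

req = build_request(
    [{"role": "user", "content": "Refactor this function for clarity."}],
    preserve_thinking=True,
)
```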

Model Summary

| | |
|:---|:---|
| Architecture | Mixture-of-Experts (MoE) |
| Total Parameters | 1T |
| Activated Parameters | 32B |
| Number of Layers | 61 (1 dense + 60 MoE) |
| Number of Experts | 384 (8 selected per token, 1 shared) |
| Attention Hidden Dimension | 7168 |
| MoE Hidden Dimension per Expert | 2048 |
| Number of Attention Heads | 64 |
| Vocabulary Size | 160K |
| Context Length | 256K |
| Attention Mechanism | MLA |
| Activation Function | SwiGLU |
| Vision Encoder | MoonViT (400M parameters) |
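The activated-parameter figure follows from the routing numbers in the summary: each token is processed by 8 routed experts plus 1 shared expert out of 384. A back-of-the-envelope check (which ignores attention, embedding, and dense-layer parameters, so it is an approximation, not an exact accounting):

```python
# Rough sanity check of the MoE activation ratio from the summary.
total_experts = 384
active_experts = 8 + 1          # 8 routed + 1 shared per token
expert_fraction = active_experts / total_experts

print(f"{expert_fraction:.3%} of experts active per token")
# Overall, ~32B of ~1T parameters are activated:
print(f"{32 / 1000:.1%} of total parameters active per token")
```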

Kimi K2.6 ships with native INT4 quantization, using the same method as Kimi K2 Thinking.

Benchmarks

Agentic

  • HLE-Full (with tools): 54.0
  • BrowseComp: 83.2 (86.3 with Agent Swarm)
  • DeepSearchQA (f1-score): 92.5
  • DeepSearchQA (accuracy): 83.0
  • WideSearch (item-f1): 80.8
  • Toolathlon: 50.0
  • MCPMark: 55.9
  • Claw Eval (pass^3): 62.3; (pass@3): 80.9
  • APEX-Agents: 27.9
  • OSWorld-Verified: 73.1

Coding

  • Terminal-Bench 2.0 (Terminus-2): 66.7
  • SWE-Bench Pro: 58.6
  • SWE-Bench Multilingual: 76.7
  • SWE-Bench Verified: 80.2
  • SciCode: 52.2
  • OJBench (python): 60.6
  • LiveCodeBench (v6): 89.6

Reasoning & Knowledge

  • HLE-Full: 34.7
  • AIME 2026: 96.4
  • HMMT 2026 (Feb): 92.7
  • IMO-AnswerBench: 86.0
  • GPQA-Diamond: 90.5

Vision

  • MMMU-Pro: 79.4 (80.1 with python)
  • CharXiv (RQ): 80.4 (86.7 with python)
  • MathVision: 87.4 (93.2 with python)
  • BabyVision: 39.8 (68.5 with python)
  • V* (with python): 96.9

Use Cases

  • Autonomous agentic workflows spanning coding, research, and browsing
  • Long-horizon software engineering and multi-step code generation
  • Coding-driven UI/UX design from prompts and visual inputs
  • Document, chart, and image understanding at scale
  • Multi-agent task orchestration with parallel sub-agent coordination
  • Persistent background agents for schedule management and cross-platform operations
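For the document, chart, and image understanding use cases, images are typically passed as structured content parts inside a chat message. The `image_url` part convention below follows common OpenAI-compatible APIs and is an assumption about this deployment, not a confirmed schema:

```python
# Sketch of a multimodal chat message mixing text and an image.
# The "image_url" content-part shape is assumed from common
# OpenAI-compatible APIs; verify against your deployment's docs.

def vision_message(question: str, image_url: str) -> dict:
    """Build a user message pairing a text question with an image."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": question},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }

msg = vision_message(
    "What trend does this chart show?",
    "https://example.com/chart.png",
)
```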

Quick Start Guide

1. Choose a model and click 'Deploy' above to find available GPUs recommended for this model.
2. Rent your dedicated instance, preconfigured with the model you've selected.
3. Start sending requests to your model instance and get responses right away.
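That last step can be sketched with Python's standard library. The `/chat/completions` path, bearer-token header, and model identifier follow the common OpenAI-compatible convention and are assumptions to verify against your instance's documentation:

```python
import json
import urllib.request

def build_chat_request(base_url: str, api_key: str, prompt: str):
    """Build (but do not send) an OpenAI-style chat request."""
    body = json.dumps({
        "model": "kimi-k2.6",   # placeholder; check your deployment
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )

chat_req = build_chat_request("https://YOUR-INSTANCE/v1", "YOUR_KEY", "Hello")
# Sending it: urllib.request.urlopen(chat_req) returns the JSON completion.
```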