Modalities: text, vision
Version: 2.5
Recommended GPUs: 8x H200
Developer: Moonshot AI
Model family: Kimi K2
Total parameters: 1000B
Context length: 256,000 tokens
License: MIT (Modified)
Kimi K2.5 is an open-source, native multimodal agentic model developed by Moonshot AI. Built through continual pretraining on approximately 15 trillion mixed visual and text tokens atop Kimi-K2-Base, this model seamlessly integrates vision and language understanding with advanced agentic capabilities.
Kimi K2.5 represents a significant advancement in multimodal AI, combining a trillion-parameter Mixture-of-Experts (MoE) architecture with native vision capabilities. The model activates 32 billion parameters per token, maintaining efficiency by routing each token to 8 of its 384 experts.
The architecture features 61 layers, Multi-head Latent Attention (MLA) for efficient attention computation, and a 400M-parameter MoonViT vision encoder. This design enables the model to process text, images, and video inputs within a unified framework.
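For intuition, here is a minimal sketch of the top-k gating step behind those numbers: each token is scored against all 384 experts, and only the 8 highest-scoring experts run for it. The dimensions, weight tensors, and function name are illustrative, not Kimi K2.5's actual implementation.

```python
import torch
import torch.nn.functional as F

def topk_expert_routing(hidden, router_weight, num_selected=8):
    """Sketch of MoE top-k gating: score each token against all experts,
    keep only the top-k. Shapes here are toy values, not Kimi K2.5's."""
    # hidden: [num_tokens, d_model]; router_weight: [num_experts, d_model]
    logits = hidden @ router_weight.T              # [num_tokens, 384]
    gate_probs = F.softmax(logits, dim=-1)
    topk_probs, topk_idx = gate_probs.topk(num_selected, dim=-1)
    # Renormalize so each token's selected-expert weights sum to 1
    topk_probs = topk_probs / topk_probs.sum(dim=-1, keepdim=True)
    return topk_probs, topk_idx

# 384 experts as in Kimi K2.5; d_model of 1024 is purely illustrative
tokens = torch.randn(4, 1024)
router = torch.randn(384, 1024)
probs, idx = topk_expert_routing(tokens, router)
print(idx.shape)  # torch.Size([4, 8]) -- 8 of 384 experts per token
```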
Unlike models that retrofit vision capabilities, Kimi K2.5 was pre-trained on vision-language tokens from the ground up. This native multimodal approach enables superior visual knowledge extraction and cross-modal reasoning, allowing the model to understand and reason about visual content with the same fluency as text.
Kimi K2.5 can generate code directly from visual specifications, transforming UI designs and video workflows into functional implementations. The model autonomously orchestrates tools for visual data processing, bridging the gap between design and development.
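As a sketch of what a design-to-code request might look like against an OpenAI-compatible endpoint (for example, a self-hosted instance serving Kimi K2.5), the snippet below sends a UI mockup image alongside a code-generation prompt. The base URL, API key, model id, and file name are placeholders, not documented values.

```python
import base64
from openai import OpenAI

# Placeholder endpoint and credentials -- substitute whatever your
# deployment serving Kimi K2.5 actually exposes.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

with open("ui_mockup.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="kimi-k2.5",  # placeholder model id
    messages=[{
        "role": "user",
        "content": [
            {"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
            {"type": "text",
             "text": "Implement this mockup as a single HTML/CSS page."},
        ],
    }],
)
print(response.choices[0].message.content)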
The model introduces a novel agent swarm capability, transitioning from single-agent execution to self-directed, coordinated multi-agent workflows. Kimi K2.5 can decompose complex tasks into parallel sub-tasks and dynamically instantiate domain-specific agents to handle them, enabling sophisticated problem-solving at scale.
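The swarm pattern can be approximated client-side for intuition: decompose a task into sub-tasks, then fan them out as concurrent model calls. The sketch below stubs the decomposition with a fixed list and reuses the placeholder endpoint and model id from above; Kimi K2.5 itself performs this orchestration natively.

```python
import asyncio
from openai import AsyncOpenAI

client = AsyncOpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
MODEL = "kimi-k2.5"  # placeholder model id

async def run_worker(subtask: str) -> str:
    """One 'domain-specific agent': a single call handling one sub-task."""
    resp = await client.chat.completions.create(
        model=MODEL,
        messages=[{"role": "user", "content": subtask}],
    )
    return resp.choices[0].message.content

async def run_swarm(subtasks: list[str]) -> list[str]:
    # Sub-tasks run concurrently, mirroring the parallel agent workflow
    return await asyncio.gather(*(run_worker(t) for t in subtasks))

results = asyncio.run(run_swarm([
    "Summarize the requirements document.",
    "Draft the database schema.",
    "List the API endpoints needed.",
]))
```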
Kimi K2.5 supports two distinct operational modes (a request sketch for both follows this list):
Thinking Mode (Default): Provides detailed reasoning content alongside responses, ideal for complex analytical tasks. Uses temperature 1.0 and top_p 0.95 for optimal performance.
Instant Mode: Delivers faster responses with disabled thinking, suitable for straightforward queries. Uses temperature 0.6 for more focused outputs.
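Below is a hedged example of invoking each mode with the sampling settings listed above, again against an assumed OpenAI-compatible endpoint with a placeholder model id. How thinking is disabled varies by serving stack; the vLLM-style chat_template_kwargs flag shown is an assumption, not a documented Kimi K2.5 parameter.

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
MODEL = "kimi-k2.5"  # placeholder model id

# Thinking mode (default): sampling settings recommended above
thinking = client.chat.completions.create(
    model=MODEL,
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
    temperature=1.0,
    top_p=0.95,
)

# Instant mode: lower temperature for focused output. The template flag
# below is an assumed vLLM-style toggle, not a documented K2.5 parameter.
instant = client.chat.completions.create(
    model=MODEL,
    messages=[{"role": "user", "content": "Capital of France?"}],
    temperature=0.6,
    extra_body={"chat_template_kwargs": {"enable_thinking": False}},
)
```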
Kimi K2.5 demonstrates strong performance across diverse evaluation benchmarks.
Kimi K2.5 excels in a variety of applications.
