GLM 4.7

LLM
Reasoning

Advanced agentic, reasoning, and coding model

On-Demand Dedicated 8xH200

Details

Modalities

text

Version

V4.7

Recommended Hardware

8xH200

Provider

Z.ai

Family

GLM

Parameters

358B

Context

200,000 tokens

License

MIT

GLM 4.7: Advanced Coding, Reasoning, and Agentic Model

GLM 4.7 is a 358B parameter language model developed by Z.ai, designed as a comprehensive coding partner with significant improvements over GLM 4.6 across coding, reasoning, tool use, and agentic tasks.

Key Features

  • Core Coding - Major improvements in real-world software engineering with SWE-bench Verified score of 73.8% (+5.8% over GLM 4.6) and SWE-bench Multilingual at 66.7% (+12.9%)
  • Vibe Coding - Improved UI generation quality with cleaner, more modern webpage output and better slide generation with accurate layout and sizing
  • Tool Use - Strong performance in tool-integrated workflows with BrowseComp score of 52% and tau-2-Bench score of 87.4%
  • Complex Reasoning - Achieves 42.8% on Humanity's Last Exam (HLE) with tools, 95.7% on AIME 2025, and 97.1% on HMMT Feb 2025
  • Interleaved Thinking - The model thinks before every response and tool call, enabling more deliberate and accurate outputs
  • Preserved Thinking - Retains thinking blocks across multi-turn conversations, improving coherence in agentic coding workflows
  • Turn-level Thinking Control - Per-turn control over reasoning depth allows optimization of latency and cost

Use Cases

  • Code generation, debugging, and real-world software engineering tasks
  • Multi-turn agentic workflows with tool calling and web browsing
  • Complex mathematical reasoning and problem solving
  • Web UI and application development
  • Terminal-based development and operations
  • Multi-step research tasks requiring tool integration
  • Long-form document analysis and generation

Architecture and Thinking Capabilities

GLM 4.7 introduces a refined thinking architecture with three distinct modes. Interleaved thinking allows the model to reason before every response and tool call. Preserved thinking retains reasoning blocks across conversation turns, which is particularly valuable for multi-step coding agent tasks where context continuity improves accuracy. Turn-level thinking provides granular control over when and how deeply the model reasons, allowing users to balance output quality against latency.

The model supports integration with popular coding agent frameworks and provides native tool calling capabilities with structured output for function calling workflows.
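As a rough sketch of what a tool-calling request can look like, the payload below follows the widely used OpenAI-style Chat Completions convention for function calling, which many serving stacks expose. The `get_weather` tool is a made-up example, not part of GLM 4.7 itself.

```python
import json

# Hypothetical function-calling payload in the OpenAI Chat Completions
# style. The tool definition (get_weather) is illustrative only.
payload = {
    "model": "glm-4.7",
    "messages": [
        {"role": "user", "content": "What's the weather in Berlin today?"}
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",  # made-up tool for illustration
                "description": "Look up the current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
    "tool_choice": "auto",  # let the model decide whether to call a tool
}

body = json.dumps(payload)
```

When the model decides a tool is needed, the response carries a structured `tool_calls` entry whose arguments match the declared JSON schema, which the calling agent executes and feeds back as a `tool` message.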

Training Approach

GLM 4.7 was trained with a focus on real-world coding performance, agentic task completion, and reasoning depth. The model uses a default evaluation setting of temperature 1.0 with top-p 0.95 for general tasks, with specialized settings for coding benchmarks including temperature 0.7 for SWE-bench and Terminal Bench evaluations.
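The evaluation settings above can be captured as simple per-task presets. This is a minimal sketch, not an official configuration: the source only specifies temperature 0.7 for the coding benchmarks, so carrying over top-p 0.95 there is an assumption.

```python
# Sampling presets mirroring the quoted evaluation settings:
# temperature 1.0 / top-p 0.95 for general tasks, temperature 0.7
# for coding benchmarks (SWE-bench, Terminal Bench). top_p for the
# coding preset is assumed, not stated in the source.
SAMPLING_PRESETS = {
    "general": {"temperature": 1.0, "top_p": 0.95},
    "coding": {"temperature": 0.7, "top_p": 0.95},
}

def sampling_params(task: str) -> dict:
    """Return sampling kwargs for a task type, defaulting to general."""
    return SAMPLING_PRESETS.get(task, SAMPLING_PRESETS["general"])
```

Treat these as starting points to tune against your own latency and quality targets rather than fixed requirements.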

Deploy GLM 4.7 on Vast.ai for access to advanced coding, reasoning, and agentic capabilities with flexible GPU infrastructure.

Quick Start Guide

Choose a model and click 'Deploy' above to find available GPUs recommended for this model.

Rent your dedicated instance preconfigured with the model you've selected.

Start sending requests to your model instance and get responses right away.
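The request step can be sketched as follows, assuming the deployed instance exposes an OpenAI-compatible `/v1/chat/completions` endpoint (typical of serving stacks such as vLLM or SGLang); the base URL, port, and model name are placeholders you would replace with your instance's details.

```python
import json
import urllib.request

# Placeholder address for your rented instance; replace with the
# actual IP and port shown on your instance dashboard.
BASE_URL = "http://YOUR_INSTANCE_IP:8000"

def build_request(prompt: str) -> urllib.request.Request:
    """Construct a chat-completion request for the deployed model.

    Assumes an OpenAI-compatible endpoint; sampling settings follow
    the general-task defaults quoted above (temperature 1.0, top-p 0.95).
    """
    body = json.dumps({
        "model": "glm-4.7",
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 1.0,
        "top_p": 0.95,
    }).encode()
    return urllib.request.Request(
        f"{BASE_URL}/v1/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )
```

To actually send it, pass the result to `urllib.request.urlopen(...)` and read the first choice's `message.content` from the JSON response.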

© 2026 Vast.ai. All rights reserved.