GLM 4.7: Advanced Coding, Reasoning, and Agentic Model
GLM 4.7 is a 358B-parameter language model from Z.ai, designed as a comprehensive coding partner. It delivers significant improvements over GLM 4.6 across coding, reasoning, tool use, and agentic tasks.
Key Features
- Core Coding - Major improvements in real-world software engineering, with a SWE-bench Verified score of 73.8% (up 5.8 points from GLM 4.6) and SWE-bench Multilingual at 66.7% (up 12.9 points)
- Vibe Coding - Improved UI generation quality, with cleaner, more modern webpage output, plus better slide generation with accurate layout and sizing
- Tool Use - Strong performance in tool-integrated workflows with BrowseComp score of 52% and tau-2-Bench score of 87.4%
- Complex Reasoning - Achieves 42.8% on Humanity's Last Exam (HLE) with tools, 95.7% on AIME 2025, and 97.1% on HMMT Feb 2025
- Interleaved Thinking - The model thinks before every response and tool call, enabling more deliberate and accurate outputs
- Preserved Thinking - Retains thinking blocks across multi-turn conversations, improving coherence in agentic coding workflows
- Turn-level Thinking Control - Per-turn control over reasoning depth allows optimization of latency and cost
Use Cases
- Code generation, debugging, and real-world software engineering tasks
- Multi-turn agentic workflows with tool calling and web browsing
- Complex mathematical reasoning and problem solving
- Web UI and application development
- Terminal-based development and operations
- Multi-step research tasks requiring tool integration
- Long-form document analysis and generation
Architecture and Thinking Capabilities
GLM 4.7 introduces a refined thinking architecture with three distinct modes. Interleaved thinking allows the model to reason before every response and tool call. Preserved thinking retains reasoning blocks across conversation turns, which is particularly valuable for multi-step coding agent tasks where context continuity improves accuracy. Turn-level thinking control provides granular control over when and how deeply the model reasons, allowing users to balance output quality against latency.
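In practice, turn-level control means each request can enable or disable the thinking phase independently. A minimal sketch of what that looks like over an OpenAI-compatible chat-completions payload is below; the `thinking` field name and its `enabled`/`disabled` values are assumptions modeled on common GLM-style APIs, not confirmed details of GLM 4.7's interface.

```python
# Sketch: toggling reasoning depth per turn in an OpenAI-compatible
# chat-completions payload. The "thinking" parameter is an assumption,
# not a documented GLM 4.7 field.

def build_request(messages: list[dict], think: bool) -> dict:
    """Build one turn's request, enabling or disabling the thinking
    phase for this turn only."""
    return {
        "model": "glm-4.7",
        "messages": messages,
        # Hypothetical per-turn switch: deep reasoning for hard steps,
        # disabled for cheap, latency-sensitive turns.
        "thinking": {"type": "enabled" if think else "disabled"},
    }

# Deep reasoning for a hard refactor...
hard_turn = build_request(
    [{"role": "user", "content": "Refactor this module and fix the race condition."}],
    think=True,
)
# ...but skip thinking for a trivial follow-up to cut latency and cost.
fast_turn = build_request(
    [{"role": "user", "content": "Also rename the variable `tmp` to `buffer`."}],
    think=False,
)
```

The payoff is that an agent loop can reserve expensive reasoning for the turns that need it while keeping routine edits fast.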
The model supports integration with popular coding agent frameworks and provides native tool calling capabilities with structured output for function calling workflows.
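For the tool-calling side, a request typically attaches function schemas in the OpenAI-compatible format and parses the structured call the model returns. The sketch below assumes that format; the `run_tests` tool name and its parameters are illustrative, not part of any published GLM 4.7 example.

```python
import json

# Sketch of a native tool-calling request using the OpenAI-compatible
# function schema; the tool definition here is hypothetical.
tools = [{
    "type": "function",
    "function": {
        "name": "run_tests",
        "description": "Run the project's test suite and return failures.",
        "parameters": {
            "type": "object",
            "properties": {
                "path": {"type": "string", "description": "Test file or directory."},
            },
            "required": ["path"],
        },
    },
}]

request = {
    "model": "glm-4.7",
    "messages": [{"role": "user", "content": "Run the tests under tests/agents."}],
    "tools": tools,
    "tool_choice": "auto",  # let the model decide when to call the tool
}

# A structured tool call arrives with JSON-encoded arguments; the agent
# decodes them before dispatching to the real function.
example_call = {"name": "run_tests", "arguments": json.dumps({"path": "tests/agents"})}
args = json.loads(example_call["arguments"])
```

Because arguments come back as a JSON string matching the declared schema, the agent framework can validate them before executing the tool.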
Training Approach
GLM 4.7 was trained with a focus on real-world coding performance, agentic task completion, and reasoning depth. The recommended defaults for evaluation are temperature 1.0 with top-p 0.95 for general tasks; coding benchmarks such as SWE-bench and Terminal Bench were evaluated at temperature 0.7.
Deploy GLM 4.7 on Vast.ai for access to advanced coding, reasoning, and agentic capabilities with flexible GPU infrastructure.