DeepSeek R1-0528: Enhanced Reasoning with Simplified Thinking Mode on Vast.ai

July 16, 2025
5 Min Read
By Team Vast

The DeepSeek R1-0528 release makes the model's "thinking" mode far easier to access: no complex prompt engineering or manually prepended thinking tokens are required. The model provides transparent, step-by-step reasoning that is particularly valuable for educational applications, complex problem-solving, and any scenario where transparency in AI decision-making is crucial.

In this post, we'll explore how to deploy the DeepSeek-R1-0528-Qwen3-8B model using vLLM on Vast.ai's cloud GPU platform, leveraging the new qwen3 reasoning parser that simplifies access to the model's internal thinking process.


The Open Source AI Revolution

DeepSeek R1-0528 represents more than just a technical advancement; it is a significant step toward democratizing advanced AI. As an open source model, it puts state-of-the-art reasoning within reach of teams that could never access it through proprietary APIs alone.

Unlike closed-source models, which bring vendor lock-in, metered usage costs, and data privacy concerns, DeepSeek R1-0528 gives you complete control over your AI infrastructure. Organizations can deploy advanced reasoning capabilities on hardware they control, maintaining data sovereignty while avoiding variable per-token pricing.

This democratization means organizations can access cutting-edge AI without massive infrastructure investments. Advanced reasoning capabilities are no longer exclusive to large tech companies—anyone with GPU access can deploy their own reasoning-capable AI system.


What Makes DeepSeek R1-0528 Special?

The DeepSeek R1-0528 release addresses a key limitation in previous DeepSeek models: the complexity of accessing the model's internal reasoning process. Previously, users needed to manually prepend "thinking" tokens to model outputs or use complex prompt engineering to see step-by-step reasoning.


Deploying DeepSeek R1-0528 on Vast.ai

Step 1: Set Up Vast.ai Environment

First, install the Vast CLI and configure your API key:

pip install --upgrade vastai

export VAST_API_KEY="your_vast_api_key"
vastai set api-key $VAST_API_KEY

Step 2: Find Suitable Hardware

The DeepSeek-R1-0528-Qwen3-8B model requires specific hardware capabilities:

  • Minimum 24GB GPU RAM for model weights and KV cache
  • Single GPU configuration (sufficient for the 8B parameter model)
  • Static IP address for stable API endpoint hosting
  • Direct port access for the vLLM OpenAI-compatible server
  • 60GB+ disk space for model storage and dependencies
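As a rough sanity check on the 24GB figure, the weight footprint of an 8B-parameter model in bf16 can be estimated in a couple of lines. These are back-of-envelope approximations, not exact measurements; KV cache and runtime overhead come on top of the weights, which is why smaller cards are a tight fit:

```python
# Back-of-envelope VRAM estimate for an 8B-parameter model.
# Approximate figures only, not exact memory measurements.

PARAMS = 8e9          # ~8 billion parameters
BYTES_PER_PARAM = 2   # bf16/fp16 weights

weights_gb = PARAMS * BYTES_PER_PARAM / 1024**3
print(f"Weights alone: ~{weights_gb:.1f} GB")

# vLLM pre-allocates most of the remaining VRAM for the KV cache
# (controlled by --gpu-memory-utilization, default 0.90), so a
# 24 GB card leaves roughly this much budget for cache + overhead:
headroom_gb = 24 * 0.90 - weights_gb
print(f"KV cache + overhead budget: ~{headroom_gb:.1f} GB")
```

The weights alone come to roughly 15 GB, which is why 24GB is the practical floor for comfortable serving.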

Search for suitable instances:

vastai search offers "compute_cap >= 750 \
gpu_ram >= 24 \
num_gpus = 1 \
static_ip = true \
direct_port_count >= 1 \
verified = true \
disk_space >= 60 \
rentable = true"
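The search typically returns many matching offers, and the CLI can emit them as JSON (via its raw output mode) for programmatic filtering. A minimal sketch of ranking offers by hourly price; the field names here ("dph_total" for dollars per hour, "gpu_name", "gpu_ram") follow Vast's offer schema but should be treated as assumptions and checked against your CLI version:

```python
# Sketch: rank offers (as JSON from the Vast CLI's raw output) by
# hourly price. Field names like "dph_total" (dollars per hour) are
# assumptions based on Vast's offer schema; verify against your CLI.
import json

sample = json.loads("""[
  {"id": 101, "gpu_name": "RTX 4090", "gpu_ram": 24576, "dph_total": 0.42},
  {"id": 102, "gpu_name": "A5000",    "gpu_ram": 24576, "dph_total": 0.31},
  {"id": 103, "gpu_name": "L40S",     "gpu_ram": 49152, "dph_total": 0.79}
]""")

cheapest = sorted(sample, key=lambda o: o["dph_total"])
for offer in cheapest:
    print(f'{offer["id"]}: {offer["gpu_name"]} '
          f'({offer["gpu_ram"]} MB) at ${offer["dph_total"]}/hr')
```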

Step 3: Deploy the Model

Deploy using the vLLM OpenAI-compatible server with the qwen3 reasoning parser:

export INSTANCE_ID="your_instance_id"

vastai create instance $INSTANCE_ID \
--image vllm/vllm-openai:latest \
--env '-p 8000:8000' \
--disk 60 \
--args --model deepseek-ai/DeepSeek-R1-0528-Qwen3-8B \
--max-model-len 4096 \
--reasoning-parser qwen3

Key deployment parameters:

  • --reasoning-parser qwen3: Enables the simplified reasoning parser, which returns the model's thinking in a separate reasoning_content field
  • --max-model-len 4096: Caps the context length so the KV cache fits comfortably in 24GB alongside the model weights
  • --tensor-parallel-size and --gpu-memory-utilization are left at vLLM's defaults (1 and 0.90), which suit this single-GPU deployment

Step 4: Get Instance Details and Test Connection

Navigate to the Instances tab in the Vast AI Console and locate your instance. Click the IP address button to view the forwarded ports.

Test the connection with a simple curl command:

export VAST_IP_ADDRESS="your_ip_address"
export VAST_PORT="your_port"

curl -X POST http://$VAST_IP_ADDRESS:$VAST_PORT/v1/completions \
-H "Content-Type: application/json" \
-d '{"model": "deepseek-ai/DeepSeek-R1-0528-Qwen3-8B", "prompt": "Hello, how are you?", "max_tokens": 50}'

Testing the Model's Reasoning Capabilities

Setup Python Client

import time
from openai import OpenAI

# Replace with your actual Vast.ai instance details
VAST_IP_ADDRESS = "your_ip_address"
VAST_PORT = "your_port"

client = OpenAI(
    api_key="DUMMY",
    base_url=f"http://{VAST_IP_ADDRESS}:{VAST_PORT}/v1"
)

Basic Functionality Test

start = time.time()
test_resp = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1-0528-Qwen3-8B",
    messages=[{"role": "user", "content": "Hello"}]
)
elapsed = time.time() - start

print("Response:", test_resp.choices[0].message.content)
print(f"Elapsed time: {elapsed:.4f} s")

Reasoning Capability Test

This is where DeepSeek R1-0528 truly shines. The model naturally provides step-by-step reasoning without complex prompt engineering:

reasoning_prompt = """
Write the History of Paris in a paragraph, show your thinking 
step by step first before paragraph and then write the history 
in a paragraph
"""

start = time.time()
resp = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1-0528-Qwen3-8B",
    messages=[{"role": "user", "content": reasoning_prompt}]
)
elapsed = time.time() - start

print("Reasoning steps:\n", resp.choices[0].message.reasoning_content)
print("Final answer:\n", resp.choices[0].message.content)
print(f"Elapsed time: {elapsed:.3f} s")

Expected output structure:

  1. reasoning_content: The model's step-by-step thinking, extracted automatically by the qwen3 parser
  2. content: A well-structured paragraph about Paris's history

Mathematical Reasoning Test

math_prompt = "What is 8 × 7? Show your work step by step."

resp = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1-0528-Qwen3-8B",
    messages=[{"role": "user", "content": math_prompt}]
)

print("Reasoning trace:\n", resp.choices[0].message.reasoning_content)
print("Mathematical reasoning:\n", resp.choices[0].message.content)

The model will naturally break down the problem, show intermediate steps, and arrive at the solution with clear reasoning traces.


Key Benefits of DeepSeek R1-0528

1. Simplified Reasoning Access

No need to prepend thinking tokens or use complex prompt engineering. The qwen3 reasoning parser automatically handles the extraction and presentation of reasoning steps.
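Under the hood, the model emits its reasoning between <think>…</think> tags, and the qwen3 parser splits that span into the separate reasoning_content field before the response leaves the server. A minimal local sketch of that split, for illustration only (the real parser lives inside vLLM and handles streaming and edge cases this toy version does not):

```python
# Illustrative sketch of the reasoning/answer split the qwen3 parser
# performs server-side. Not the actual vLLM implementation.
import re

def split_reasoning(raw: str) -> tuple[str, str]:
    """Separate a <think>...</think> block from the final answer."""
    match = re.search(r"<think>(.*?)</think>", raw, flags=re.DOTALL)
    if not match:
        return "", raw.strip()
    reasoning = match.group(1).strip()
    answer = raw[match.end():].strip()
    return reasoning, answer

raw_output = "<think>8 x 7: 8 x 7 = 56.</think>The answer is 56."
reasoning, answer = split_reasoning(raw_output)
print("Reasoning:", reasoning)
print("Answer:", answer)
```

With the parser enabled server-side, client code never needs to do this; it simply reads message.reasoning_content and message.content from the response.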

2. Educational Value

Perfect for educational applications where students need to understand the problem-solving process, not just the final answer.

3. Transparency and Trust

Users can see exactly how the model arrives at its conclusions, building trust in AI-powered decision-making systems.

4. Cost-Effective Deployment

Running on Vast.ai provides significant cost savings compared to major cloud providers while maintaining high performance.

5. Easy Integration

OpenAI-compatible API ensures seamless integration with existing applications and workflows.


Use Cases

Educational Applications

  • Step-by-step problem solving for mathematics, science, and logic
  • Teaching critical thinking and reasoning skills
  • Creating interactive learning experiences

Business Intelligence

  • Transparent decision-making in automated systems
  • Audit trails for AI-powered recommendations
  • Complex analysis with clear reasoning paths
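For the audit-trail use case above, the separated reasoning and answer map naturally onto a structured log record. A minimal sketch; the field names and JSON-lines format are illustrative choices, not a Vast.ai or vLLM convention:

```python
# Sketch: serialize one model interaction as a JSON-lines audit entry.
# Field names are illustrative, not a standard schema.
import json
import time

def audit_record(prompt: str, reasoning: str, answer: str, model: str) -> str:
    return json.dumps({
        "timestamp": time.time(),
        "model": model,
        "prompt": prompt,
        "reasoning": reasoning,  # from message.reasoning_content
        "answer": answer,        # from message.content
    })

line = audit_record(
    prompt="What is 8 x 7?",
    reasoning="8 x 7 = 56.",
    answer="56",
    model="deepseek-ai/DeepSeek-R1-0528-Qwen3-8B",
)
print(line)
```

Appending one such line per request gives reviewers both the conclusion and the reasoning path that produced it.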

Research and Development

  • Understanding model behavior and capabilities
  • Testing reasoning patterns across different domains
  • Developing more interpretable AI systems

Conclusion

The DeepSeek R1-0528 release represents a significant step forward in making AI reasoning more accessible and transparent. By combining this powerful model with Vast.ai's cost-effective GPU infrastructure and the simplified qwen3 reasoning parser, developers can now easily deploy transparent, reasoning-capable AI systems.

Whether you're building educational tools, business intelligence systems, or research applications, DeepSeek R1-0528 on Vast.ai provides an excellent foundation for applications that require clear, step-by-step reasoning capabilities.

The elimination of complex prompt engineering for accessing reasoning modes makes this model particularly valuable for production applications where transparency and interpretability are key requirements.


Ready to deploy transparent, reasoning-capable AI? Get started with DeepSeek R1-0528 on Vast.ai today! 🚀

© 2025 Vast.ai. All rights reserved.