DeepSeek OCR: Contexts Optical Compression vision-language model
Modality: vision
Recommended hardware: 1x RTX 4090
Developer: DeepSeek AI
Model family: DeepSeek
Parameters: 3B
Context length: 8192 tokens
License: MIT
DeepSeek OCR is a vision-language model from DeepSeek AI that specializes in optical character recognition and document understanding. Its core innovation, "Contexts Optical Compression", optimizes how visual information is compressed when processing text-heavy documents.
DeepSeek OCR excels at converting documents and images into structured formats, with particular emphasis on markdown conversion and raw text extraction. The model supports flexible inference modes through multiple configuration sizes (Tiny, Small, Base, Large, Gundam) that can be selected to match processing requirements, each defined by its own base_size and image_size parameters, as sketched below.
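As a rough illustration, the presets commonly cited for these modes pair each name with a resolution configuration along these lines; the exact values are an assumption drawn from the public model card and should be verified there before use:

```python
# Hypothetical preset table for DeepSeek OCR's named inference modes.
# Values follow the commonly cited configurations from the model card;
# verify against the official documentation before relying on them.
MODE_PRESETS = {
    "tiny":   {"base_size": 512,  "image_size": 512,  "crop_mode": False},
    "small":  {"base_size": 640,  "image_size": 640,  "crop_mode": False},
    "base":   {"base_size": 1024, "image_size": 1024, "crop_mode": False},
    "large":  {"base_size": 1280, "image_size": 1280, "crop_mode": False},
    # "Gundam" tiles the page: a global view plus smaller crops.
    "gundam": {"base_size": 1024, "image_size": 640,  "crop_mode": True},
}

def preset(mode: str) -> dict:
    """Return the size parameters for a named inference mode."""
    return MODE_PRESETS[mode.lower()]
```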
The model includes specialized grounding capabilities: grounding tokens in the prompt enhance document understanding, making it particularly effective at maintaining context and structure during OCR. It also employs n-gram logit processing for structured output generation, which proves especially useful for complex table extraction tasks.
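Below is a minimal sketch of a grounded markdown-conversion call, assuming the deepseek-ai/DeepSeek-OCR checkpoint and the custom infer helper it exposes through trust_remote_code; the prompt template, file paths, and argument names follow the public model card and may differ across revisions:

```python
# Minimal sketch: grounding-token prompt for markdown conversion.
# The <|grounding|> token and the infer() signature are taken from
# the public model card and are assumptions, not a stable API.
import torch
from transformers import AutoModel, AutoTokenizer

model_name = "deepseek-ai/DeepSeek-OCR"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModel.from_pretrained(
    model_name,
    trust_remote_code=True,
    use_safetensors=True,
    attn_implementation="flash_attention_2",  # requires flash-attn installed
)
model = model.eval().cuda().to(torch.bfloat16)

# The grounding token asks the model to keep layout and structure
# anchored to regions of the page while converting it to markdown.
prompt = "<image>\n<|grounding|>Convert the document to markdown."
result = model.infer(
    tokenizer,
    prompt=prompt,
    image_file="page.png",  # hypothetical input path
    output_path="out/",     # hypothetical output directory
    base_size=1024,
    image_size=640,
    crop_mode=True,         # "Gundam" mode for dense layouts
    save_results=True,
)
```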
Built on the Transformers framework and distributed in the Safetensors format, DeepSeek OCR uses Flash Attention 2 for optimized performance on NVIDIA GPUs. The architecture supports custom inference parameters, including crop_mode for flexible handling of varied document layouts and formats. Integration with vLLM enables accelerated inference with batch processing support for production workloads, as sketched below.
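For batch workloads, here is a hedged sketch of offline batched inference through vLLM's generic multi-modal interface; whether this checkpoint runs out of the box depends on your vLLM version, and the "Free OCR." prompt format is an assumption taken from the public model card:

```python
# Sketch of batched OCR with vLLM's offline API. Assumes a vLLM build
# that supports this checkpoint; the prompt template and image handling
# use vLLM's generic multi-modal interface and may need adjusting.
from PIL import Image
from vllm import LLM, SamplingParams

llm = LLM(model="deepseek-ai/DeepSeek-OCR", trust_remote_code=True)
sampling = SamplingParams(temperature=0.0, max_tokens=2048)

pages = ["page1.png", "page2.png"]  # hypothetical input files
requests = [
    {
        "prompt": "<image>\nFree OCR.",
        "multi_modal_data": {"image": Image.open(p).convert("RGB")},
    }
    for p in pages
]

# vLLM batches the whole request list in one call for higher throughput.
for output in llm.generate(requests, sampling):
    print(output.outputs[0].text)
```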
DeepSeek OCR is designed for a wide range of document processing applications, including document-to-markdown conversion, raw text extraction, and structured table extraction.
The model has achieved significant adoption in the community, with over 4 million downloads monthly. It is actively deployed in more than 78 community Spaces, demonstrating diverse real-world applications across document understanding tasks.
DeepSeek OCR is published under the MIT license, making it accessible for both commercial and non-commercial use.