@hoonsubin
Last active February 25, 2026 16:22
# Ollama model card template for the Obsidian Web Clipper
````json
{
  "schemaVersion": "0.1.0",
  "name": "Ollama Model Card",
  "behavior": "create",
  "noteContentFormat": "# {{title}}\n\n**Model Summary**\n\n{{selectorHtml:#summary-content|markdown|callout}}\n\n**{{selectorHtml:div.use-panel?data-panel|first|markdown}} usage**\n\n```sh\n{{selectorHtml:div.use-panel > pre|first|markdown}}\n```\n\n**Models**\n\n{{selector:div.hidden.group a.font-medium, div.hidden.group p.col-span-2.text-neutral-500|map:v => v.trim()|table:(\"Name\",\"Size\",\"Context\",\"Input\")}}\n\n{{selectorHtml:section.flex-col > div > a|markdown}}\n\n## Description\n\n{{selectorHtml:div#readme > div > div#display|markdown}}\n",
  "properties": [
    {
      "name": "title",
      "value": "{{title}}",
      "type": "text"
    },
    {
      "name": "clipped_date",
      "value": "{{time}}",
      "type": "date"
    },
    {
      "name": "source",
      "value": "{{url}}",
      "type": "text"
    },
    {
      "name": "description",
      "value": "{{description}}",
      "type": "text"
    },
    {
      "name": "model_tags",
      "value": "{{selector:.inline-flex.items-center.rounded-md}}",
      "type": "multitext"
    }
  ],
  "triggers": [
    "https://ollama.com/library"
  ],
  "noteNameFormat": "{{title}}",
  "path": "Imports/ollama"
}
````
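The `map` and `table` filters in the Models section are the trickiest part of the template: the selector returns a flat list of matched text values, `map:v => v.trim()` cleans each one, and `table:(...)` groups them into rows of one cell per header. A rough sketch of that pipeline, with hypothetical helper names (this is not the clipper's actual implementation):

```typescript
// Hypothetical sketch of the clipper's map/table filter pipeline.
// The selector yields a flat list of text values; `map` transforms each
// value, and `table` chunks them into rows under the given headers.

function applyMap(values: string[], fn: (v: string) => string): string[] {
  return values.map(fn);
}

function toMarkdownTable(values: string[], headers: string[]): string {
  const cols = headers.length;
  // Chunk the flat value list into rows of `cols` cells each.
  const rows: string[][] = [];
  for (let i = 0; i < values.length; i += cols) {
    rows.push(values.slice(i, i + cols));
  }
  const lines = [
    `| ${headers.join(" | ")} |`,
    `| ${headers.map(() => "---").join(" | ")} |`,
    ...rows.map((r) => `| ${r.join(" | ")} |`),
  ];
  return lines.join("\n");
}

// Values like those the Ollama library page yields, before trimming:
const scraped = [" glm-ocr:latest ", " 2.2GB ", " 128K ", " Text, Image "];
const table = toMarkdownTable(
  applyMap(scraped, (v) => v.trim()),
  ["Name", "Size", "Context", "Input"],
);
```

Each group of four matched values becomes one row, which is why the selector pairs the model-name anchors with the size/context/input cells in document order.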
**@hoonsubin** (author) commented:

Example output:


```yaml
---
title: "glm-ocr"
clipped_date: 2026-02-25
source: "https://ollama.com/library/glm-ocr"
description: "GLM-OCR is a multimodal OCR model for complex document understanding, built on the GLM-V encoder–decoder architecture."
cover_img: "https://ollama.com/public/og.png"
skills: ["vision","tools"]
---
```

# glm-ocr

**Model Summary**

> [!info]
> GLM-OCR is a multimodal OCR model for complex document understanding, built on the GLM-V encoder–decoder architecture.

**cli usage**

```sh
ollama run glm-ocr
```

**Models**

| Name | Size | Context | Input |
| --- | --- | --- | --- |
| glm-ocr:latest | 2.2GB | 128K | Text, Image |
| glm-ocr:q8_0 | 1.6GB | 128K | Text, Image |
| glm-ocr:bf16 | 2.2GB | 128K | Text, Image |

View all →

## Description

logo.svg

GLM-OCR is a multimodal OCR model for complex document understanding, built on the GLM-V encoder–decoder architecture. The model integrates the CogViT visual encoder pre-trained on large-scale image–text data, a lightweight cross-modal connector with efficient token downsampling, and a GLM-0.5B language decoder.

image.png

### Usage

**Text recognition**

```sh
ollama run glm-ocr Text Recognition: ./image.png
```

**Table recognition**

```sh
ollama run glm-ocr Table Recognition: ./image.png
```

**Figure recognition**

```sh
ollama run glm-ocr Figure Recognition: ./image.png
```

### Key features

- **State-of-the-Art Performance:** Achieves a score of 94.62 on OmniDocBench V1.5, ranking #1 overall, and delivers state-of-the-art results across major document understanding benchmarks, including formula recognition, table recognition, and information extraction.
- **Optimized for Real-World Scenarios:** Designed and optimized for practical business use cases, maintaining robust performance on complex tables, code-heavy documents, seals, and other challenging real-world layouts.
- **Efficient Inference:** With only 0.9B parameters, GLM-OCR supports deployment via vLLM, SGLang, and Ollama, significantly reducing inference latency and compute cost, making it ideal for high-concurrency services and edge deployments.
- **Easy to Use:** Fully open-sourced and equipped with a comprehensive SDK and inference toolchain, offering simple installation, one-line invocation, and smooth integration into existing production pipelines.
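Since the model runs under Ollama, the CLI tasks above can also be scripted against Ollama's local REST API (`POST /api/generate` on the default port 11434), with the image supplied as base64 in the `images` field. A minimal sketch, assuming a locally running Ollama server; `buildOcrRequest` is a hypothetical helper, not part of any SDK:

```typescript
// Sketch of driving the OCR tasks above through Ollama's REST API.
// Assumes a local Ollama server; buildOcrRequest is a hypothetical helper.

interface GenerateRequest {
  model: string;
  prompt: string;
  images?: string[]; // base64-encoded image bytes
  stream: boolean;
}

// Build the JSON body for one OCR task, e.g. "Text Recognition".
function buildOcrRequest(task: string, imageBase64: string): GenerateRequest {
  return {
    model: "glm-ocr",
    prompt: `${task}:`,
    images: [imageBase64],
    stream: false, // one complete response instead of a token stream
  };
}

// Sending it (requires Ollama running locally):
// const res = await fetch("http://localhost:11434/api/generate", {
//   method: "POST",
//   headers: { "Content-Type": "application/json" },
//   body: JSON.stringify(buildOcrRequest("Table Recognition", imgBase64)),
// });
// const text = (await res.json()).response;
```

Passing the image in `images` rather than as a file path in the prompt matches how the API (as opposed to the CLI) handles multimodal input.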
