numind/NuMarkdown-8B-Thinking

NuMarkdown-8B-Thinking is the first reasoning OCR VLM. It is trained specifically to convert documents into clean Markdown files, which makes it well suited for RAG applications. Before generating the Markdown, it emits thinking tokens to work out the layout of the document. It is particularly good at understanding documents with unusual layouts and complex tables. The number of thinking tokens ranges from 20% to 500% of the length of the final answer, depending on task difficulty.
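Because the model emits reasoning before the Markdown, downstream code usually needs to separate the two. The sketch below is a minimal illustration, assuming the raw output wraps the reasoning in `<think>…</think>` and the final Markdown in `<answer>…</answer>`. Those tag names, and the helpers `split_reasoning_and_answer` and `thinking_ratio`, are assumptions made for illustration and should be checked against the model card. Word counts stand in for token counts here, so the ratio is only a rough proxy for the 20% to 500% range quoted above.

```python
import re


def split_reasoning_and_answer(text, think_tag="think", answer_tag="answer"):
    """Split raw model output into (reasoning, markdown).

    The tag names are assumptions; adjust them to the model's real format.
    """
    def extract(tag):
        # Non-greedy match so only the first tagged span is captured.
        match = re.search(rf"<{tag}>(.*?)</{tag}>", text, flags=re.DOTALL)
        return match.group(1).strip() if match else ""

    reasoning = extract(think_tag)
    markdown = extract(answer_tag)
    if not markdown:
        # Fallback: treat everything after the closing think tag as the answer.
        markdown = text.split(f"</{think_tag}>")[-1].strip()
    return reasoning, markdown


def thinking_ratio(reasoning, markdown):
    """Approximate thinking-to-answer length ratio using whitespace word counts."""
    answer_words = len(markdown.split())
    if answer_words == 0:
        return None
    return len(reasoning.split()) / answer_words


# Hypothetical raw output, written by hand for this example.
raw = "<think>two columns, one table</think><answer># Title\n\nBody text here now</answer>"
reasoning, markdown = split_reasoning_and_answer(raw)
ratio = thinking_ratio(reasoning, markdown)
```

For a RAG pipeline, only `markdown` would normally be indexed; the reasoning is useful mainly for debugging layout mistakes.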

NuMarkdown-8B-Thinking is a fine-tune of Qwen 2.5-VL-7B on synthetic Doc → Reasoning → Markdown examples, followed by an RL phase (GRPO) with a layout-centric reward.
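To make the RL phase more concrete, the sketch below shows the group-relative advantage step that gives GRPO its name: several completions are sampled for the same document, each is scored, and each score is normalized against the mean and standard deviation of its own group. The reward values are made up, and the layout-centric reward itself is not specified here, so it is left as an opaque number. The function name, the `eps` term, and the use of the population standard deviation are assumptions for illustration, not details of the authors' training setup.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against the other samples in its group.

    `rewards` holds one score per sampled completion for a single prompt.
    `eps` guards against division by zero when all rewards are equal.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical layout rewards for three sampled Markdown outputs of one page.
advantages = group_relative_advantages([0.2, 0.5, 0.8])
```

Completions that score above the group mean receive a positive advantage and are reinforced; those below it receive a negative one, so no separate value model is needed.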

https://huggingface.co/numind/NuMarkdown-8B-Thinking