Umi-OCR is a free, open-source optical character recognition (OCR) application originally developed for Windows by hiroi-sora on GitHub. Since its release, it has become one of the most popular open-source OCR tools, known especially for its accurate Chinese text recognition.

What Makes Umi-OCR Special?

Unlike OCR tools that depend on cloud APIs or require a subscription, Umi-OCR runs entirely offline. This means:

Privacy first — your documents never leave your machine
No API keys or rate limits — process as many documents as you need
High accuracy — particularly for Chinese, Japanese, and Korean text
Batch processing — handle hundreds of images in one session
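
For readers who want to script the batch workflow described above, here is a minimal Python sketch. It rests on assumptions that are not stated in this article: that your Umi-OCR installation has its local HTTP service enabled, that the service listens at `http://127.0.0.1:1224/api/ocr`, and that it accepts a JSON body with a `base64` field. The helper names (`build_payload`, `recognize`, `recognize_folder`) are hypothetical. Check the address and request format against your own Umi-OCR settings and documentation before relying on it.

```python
import base64
import json
import urllib.request
from pathlib import Path

# Assumption: Umi-OCR's local HTTP service is enabled and listening here.
# Check the address and port in your own Umi-OCR settings.
OCR_URL = "http://127.0.0.1:1224/api/ocr"


def build_payload(image_bytes: bytes) -> bytes:
    """Encode one image as the JSON body sent to the local OCR service."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    # Assumption: the service reads the image from a "base64" field.
    return json.dumps({"base64": encoded}).encode("utf-8")


def recognize(image_path: Path, url: str = OCR_URL) -> dict:
    """POST one image to the local service and return the decoded JSON reply."""
    request = urllib.request.Request(
        url,
        data=build_payload(image_path.read_bytes()),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))


def recognize_folder(folder: Path, suffixes=(".png", ".jpg", ".jpeg")) -> dict:
    """Run every matching image in a folder through the service, in name order."""
    results = {}
    for path in sorted(folder.iterdir()):
        if path.suffix.lower() in suffixes:
            results[path.name] = recognize(path)
    return results
```

Calling `recognize_folder(Path("scans"))` would return a dictionary that maps each file name to the service's raw reply. The sketch deliberately makes no assumption about the layout of that reply, so inspect one response before extracting text from it. Because the request goes to the loopback address, the images stay on your machine.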

Supported Languages

Umi-OCR supports a wide range of recognition languages including:

Simplified Chinese (excellent accuracy)
Traditional Chinese
English
Japanese
Korean
Mixed Chinese-English documents

How Umiocr Builds on Umi-OCR

Umiocr brings the Umi-OCR recognition engine to the web. Instead of installing desktop software, users can access the same OCR capability from a browser on any device, including smartphones, tablets, and Chromebooks.

The web version is ideal for:

Quick one-off text extractions from screenshots
Digitizing scanned PDFs without desktop software
Students and researchers working across multiple devices

Getting Started

Visit umiocr.com, click "Launch OCR Tool" in the hero section, upload your image, and extract the text in seconds. No sign-up is required.

What is Umi-OCR? The Open Source OCR Engine Behind Umiocr
