📝 README: Add information about the new glyph and word segmentation

2025-06-26 20:19:53 +02:00 · 2020-02-03 15:31:36 +01:00 · 0a572df0ba
commit 0a572df0ba
parent 2650189910
1 changed file with 8 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -13,6 +13,14 @@ This offers a OCR-D compliant workspace processor for some of the functionality
 This processor only operates on the text line level and so needs a line segmentation (and by extension a binarized 
 image) as its input.

+In addition to the line text, it also outputs glyph segmentation, including
+per-glyph confidence values and per-glyph alternative predictions as provided
+by the Calamari OCR engine. Note that while Calamari does not provide word
+segmentation, this processor produces word segmentation inferred from Unicode
+text segmentation and the glyph positions. The provided glyph and word
+segmentation can be used for text extraction and highlighting, but is probably
+not useful for further image-based processing.
+
 ## Installation

 ### From PyPI
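
Review note on the added paragraph: the idea of inferring word segmentation from the recognized text plus glyph positions can be sketched as below. This is only an illustration, not the processor's actual code. The function name `words_from_glyphs`, the `(x0, y0, x1, y1)` box format, and the one-box-per-character assumption are hypothetical, and a plain whitespace split stands in for full Unicode text segmentation.

```python
import re
from typing import List, Tuple

# Hypothetical glyph bounding box: (x0, y0, x1, y1) in pixel coordinates.
Box = Tuple[int, int, int, int]


def words_from_glyphs(text: str, glyph_boxes: List[Box]) -> List[Tuple[str, Box]]:
    """Group per-character glyph boxes into per-word boxes.

    Assumes exactly one box per character of `text`. Words are found by
    splitting on whitespace, a simplification of Unicode word boundaries.
    """
    if len(text) != len(glyph_boxes):
        raise ValueError("expected exactly one glyph box per character")

    words = []
    for match in re.finditer(r"\S+", text):
        # Glyph boxes belonging to the characters of this word.
        boxes = glyph_boxes[match.start():match.end()]
        # The word box is the union of its glyph boxes.
        word_box = (
            min(b[0] for b in boxes),
            min(b[1] for b in boxes),
            max(b[2] for b in boxes),
            max(b[3] for b in boxes),
        )
        words.append((match.group(), word_box))
    return words
```

Because each word box is just the union of its glyph boxes, it is only as precise as the glyph positions themselves, which fits the paragraph's caution that the segmentation suits text extraction and highlighting rather than further image-based processing.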