mirror of https://github.com/mikegerber/ocrd_calamari.git (synced 2025-08-08 16:49:54 +02:00)
📝 README: Add information about the new glyph and word segmentation
parent 2650189910
commit 0a572df0ba
1 changed files with 8 additions and 0 deletions
@@ -13,6 +13,14 @@ This offers a OCR-D compliant workspace processor for some of the functionality
This processor only operates on the text line level and so needs a line segmentation (and by extension a binarized
image) as its input.

In addition to the line text it also outputs glyph segmentation including
per-glyph confidence values and per-glyph alternative predictions as provided
by the Calamari OCR engine. Note that while Calamari does not provide word
segmentation, this processor produces word segmentation inferred from Unicode
text segmentation and the glyph positions. The provided glyph and word
segmentation can be used for text extraction and highlighting, but is probably
not useful for further image-based processing.
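
The word inference described above can be illustrated with a minimal sketch. It is not the processor's actual code: the `Glyph` class and `words_from_glyphs` function are hypothetical names, and a plain whitespace split stands in for full Unicode text segmentation (UAX #29), which would also separate punctuation. It assumes every glyph carries exactly one character, so string offsets map directly to glyph indices.

```python
import re
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Glyph:
    """Hypothetical per-glyph record; not the processor's real data model."""
    char: str    # recognized character (assumed to be exactly one)
    start: int   # left x coordinate within the line image, in pixels
    end: int     # right x coordinate within the line image, in pixels
    conf: float  # per-glyph confidence


def words_from_glyphs(glyphs: List[Glyph]) -> List[Tuple[str, int, int, float]]:
    """Group glyphs into words and derive each word's horizontal extent.

    Simplification: words are maximal runs of non-whitespace characters.
    Real Unicode word segmentation (UAX #29) is more fine-grained.
    """
    text = "".join(g.char for g in glyphs)
    words = []
    for match in re.finditer(r"\S+", text):
        # One character per glyph, so text offsets are glyph indices.
        span = glyphs[match.start():match.end()]
        words.append((
            match.group(),
            span[0].start,               # word starts at its first glyph
            span[-1].end,                # word ends at its last glyph
            min(g.conf for g in span),   # one possible word-level confidence
        ))
    return words


line = [
    Glyph("a", 0, 5, 0.9),
    Glyph("b", 5, 9, 0.8),
    Glyph(" ", 9, 12, 0.99),
    Glyph("c", 12, 18, 0.7),
    Glyph("d", 18, 24, 0.95),
]
print(words_from_glyphs(line))
# [('ab', 0, 9, 0.8), ('cd', 12, 24, 0.7)]
```

Because the word extents are derived only from glyph positions, they are good enough for highlighting text in a viewer but carry no more image information than the glyphs themselves. Taking the minimum glyph confidence as the word confidence is just one choice made for this sketch.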

## Installation

### From PyPI