Mirror of https://github.com/qurator-spk/eynollah.git, synced 2025-11-09 22:24:13 +01:00
Update README.md

commit f6c0f56348
parent 46a45f6b0e

1 changed file with 14 additions and 22 deletions
README.md (36 changes)
@@ -12,7 +12,7 @@
## Features

-* Document layout analysis using pixelwise segmentation models with support for 10 distinct segmentation classes:
+* Document layout analysis using pixelwise segmentation models with support for 10 segmentation classes:
  * background, [page border](https://ocr-d.de/en/gt-guidelines/trans/lyRand.html), [text region](https://ocr-d.de/en/gt-guidelines/trans/lytextregion.html#textregionen__textregion_), [text line](https://ocr-d.de/en/gt-guidelines/pagexml/pagecontent_xsd_Complex_Type_pc_TextLineType.html), [header](https://ocr-d.de/en/gt-guidelines/trans/lyUeberschrift.html), [image](https://ocr-d.de/en/gt-guidelines/trans/lyBildbereiche.html), [separator](https://ocr-d.de/en/gt-guidelines/trans/lySeparatoren.html), [marginalia](https://ocr-d.de/en/gt-guidelines/trans/lyMarginalie.html), [initial](https://ocr-d.de/en/gt-guidelines/trans/lyInitiale.html), [table](https://ocr-d.de/en/gt-guidelines/trans/lyTabellen.html)
* Textline segmentation to bounding boxes or polygons (contours) including for curved lines and vertical text
* Document image binarization with pixelwise segmentation or hybrid CNN-Transformer models

@@ -81,6 +81,8 @@ Eynollah supports five use cases:
4. [text recognition (OCR)](#ocr), and
5. [reading order detection](#reading-order-detection).
+
+Some example outputs can be found in [`examples.md`](https://github.com/qurator-spk/eynollah/tree/main/docs/examples.md).

### Layout Analysis

The layout analysis module is responsible for detecting layout elements, identifying text lines, and determining reading

@@ -152,16 +154,6 @@ TODO
### OCR

-<p align="center">
-  <img src="https://github.com/user-attachments/assets/71054636-51c6-4117-b3cf-361c5cda3528" alt="Input Image" width="45%">
-  <img src="https://github.com/user-attachments/assets/cfb3ce38-007a-4037-b547-21324a7d56dd" alt="Output Image" width="45%">
-</p>
-
-<p align="center">
-  <img src="https://github.com/user-attachments/assets/343b2ed8-d818-4d4a-b301-f304cbbebfcd" alt="Input Image" width="45%">
-  <img src="https://github.com/user-attachments/assets/accb5ba7-e37f-477e-84aa-92eafa0d136e" alt="Output Image" width="45%">
-</p>
The OCR module performs text recognition using either a CNN-RNN model or a Transformer model.

The command-line interface for OCR can be called like this:

@@ -176,17 +168,17 @@ eynollah ocr \

The following options can be used to further configure the OCR processing:

 | option | description |
-|-------------------|:------------------------------------------------------------------------------- |
+|-------------------|:-------------------------------------------------------------------------------------------|
-| `-dib` | directory of bins(files type must be '.png'). Prediction with both RGB and bins. |
+| `-dib` | directory of binarized images (file type must be '.png'), prediction with both RGB and bin |
-| `-doit` | Directory containing output images rendered with the predicted text |
+| `-doit` | directory for output images rendered with the predicted text |
-| `--model_name` | Specific model file path to use for OCR |
+| `--model_name` | file path to use specific model for OCR |
-| `-trocr` | transformer ocr will be applied, otherwise cnn_rnn model |
+| `-trocr` | use transformer ocr model (otherwise cnn_rnn model is used) |
-| `-etit` | textlines images and text in xml will be exported into output dir (OCR training data) |
+| `-etit` | export textline images and text in xml to output dir (OCR training data) |
 | `-nmtc` | cropped textline images will not be masked with textline contour |
-| `-bs` | ocr inference batch size. Default bs for trocr and cnn_rnn models are 2 and 8 respectively |
+| `-bs` | ocr inference batch size. Default batch size is 2 for trocr and 8 for cnn_rnn models |
 | `-ds_pref` | add an abbreviation of dataset name to generated training data |
-| `-min_conf` | minimum OCR confidence value. OCRs with textline conf lower than this will be ignored |
+| `-min_conf` | minimum OCR confidence value. OCR with textline conf lower than this will be ignored |
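As a hedged illustration of how these options combine, the sketch below invokes the `eynollah ocr` subcommand (confirmed by the hunk header above) with flags taken only from the options table; the input/output directory options and all paths shown are elided in this excerpt, so the values here are purely hypothetical:

```shell
# Illustrative sketch only: flags are those documented in the options table;
# the required input/output directory options are not shown in this excerpt,
# so this command is incomplete as written.
eynollah ocr \
  -bs 8 \               # batch size 8, the stated default for the cnn_rnn model
  -min_conf 0.3 \       # ignore textlines with OCR confidence below 0.3 (hypothetical threshold)
  -doit ./ocr_rendered \  # hypothetical directory for images rendered with predicted text
  -ds_pref my_dataset   # hypothetical dataset-name prefix for generated training data
```

Passing `-trocr` instead would switch inference to the Transformer OCR model, for which the table states a default batch size of 2.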

### Reading Order Detection