mirror of
https://github.com/qurator-spk/dinglehopper.git
synced 2025-07-27 19:29:55 +02:00
📝 Document OCR-D parameters and restructure README a bit
parent
8b4ee20a40
commit
0f3857d8d3
1 changed files with 16 additions and 7 deletions
23
README.md
@@ -60,6 +60,15 @@ dinglehopper some-document.gt.page.xml some-document.ocr.alto.xml
 This generates `report.html` and `report.json`.
 
 
+### dinglehopper-extract
+The tool `dinglehopper-extract` extracts the text of the given input file on
+stdout, for example:
+
+~~~
+dinglehopper-extract --textequiv-level line OCR-D-GT-PAGE/00000024.page.xml
+~~~
+
+### OCR-D
 As a OCR-D processor:
 ~~~
 ocrd-dinglehopper -I OCR-D-GT-PAGE,OCR-D-OCR-TESS -O OCR-D-OCR-TESS-EVAL
@@ -69,18 +78,18 @@ This generates HTML and JSON reports in the `OCR-D-OCR-TESS-EVAL` filegroup.
 
 
 
-You may also want to disable metrics and the green-red color scheme by
-parameter:
+The OCR-D processor has these parameters:
 
+| Parameter                 | Meaning                                                             |
+| ------------------------- | ------------------------------------------------------------------- |
+| `-P metrics false`        | Disable metrics and the green-red color scheme (default: enabled)   |
+| `-P textequiv_level line` | (PAGE) Extract text from TextLine level (default: TextRegion level) |
+
+For example:
 ~~~
 ocrd-dinglehopper -I ABBYY-FULLTEXT,OCR-D-OCR-CALAMARI -O OCR-D-OCR-COMPARE-ABBYY-CALAMARI -P metrics false
 ~~~
 
-The tool `dinglehopper-extract` extracts the text of the given input file on
-stdout, for example:
-
-`dinglehopper-extract OCR-D-GT-PAGE/00000024.page.xml`
-
 Developer information
 ---------------------
 *Please refer to [README-DEV.md](README-DEV.md).*
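The README's example passes only `-P metrics false`. As an illustration of how the two documented parameters might be passed together, here is a minimal Python sketch. It assumes that `-P` may be repeated in a single call, and `build_command` is a hypothetical helper, not part of dinglehopper. It only assembles and prints the argument list and does not run the processor.

```python
import shlex


def build_command(input_groups, output_group, metrics=True, textequiv_level=None):
    """Assemble an ocrd-dinglehopper argument list.

    Hypothetical helper for illustration only; not part of dinglehopper.
    Only the two parameters documented in the README table are handled.
    """
    cmd = ["ocrd-dinglehopper", "-I", ",".join(input_groups), "-O", output_group]
    if not metrics:
        # Documented as: -P metrics false (default: enabled)
        cmd += ["-P", "metrics", "false"]
    if textequiv_level is not None:
        # Documented as: -P textequiv_level line (default: TextRegion level)
        cmd += ["-P", "textequiv_level", textequiv_level]
    return cmd


cmd = build_command(
    ["OCR-D-GT-PAGE", "OCR-D-OCR-TESS"],
    "OCR-D-OCR-TESS-EVAL",
    metrics=False,
    textequiv_level="line",
)
# Print a shell-safe rendering of the command instead of executing it.
print(shlex.join(cmd))
```

To actually run the processor, the list could be handed to `subprocess.run(cmd, check=True)` inside an OCR-D workspace, assuming `ocrd-dinglehopper` is installed and on the `PATH`.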