mirror of
https://github.com/qurator-spk/dinglehopper.git
synced 2025-06-17 23:59:59 +02:00
📝 Document OCR-D parameters and restructure README a bit
parent
8b4ee20a40
commit
0f3857d8d3
1 changed file with 16 additions and 7 deletions
23
README.md
@ -60,6 +60,15 @@ dinglehopper some-document.gt.page.xml some-document.ocr.alto.xml
This generates `report.html` and `report.json`.

### dinglehopper-extract
The tool `dinglehopper-extract` extracts the text of the given input file on
stdout, for example:

~~~
dinglehopper-extract --textequiv-level line OCR-D-GT-PAGE/00000024.page.xml
~~~

### OCR-D
As an OCR-D processor:
~~~
ocrd-dinglehopper -I OCR-D-GT-PAGE,OCR-D-OCR-TESS -O OCR-D-OCR-TESS-EVAL
@ -69,18 +78,18 @@ This generates HTML and JSON reports in the `OCR-D-OCR-TESS-EVAL` filegroup.

You may also want to disable metrics and the green-red color scheme by
parameter:
The OCR-D processor has these parameters:

| Parameter                 | Meaning                                                             |
| ------------------------- | ------------------------------------------------------------------- |
| `-P metrics false`        | Disable metrics and the green-red color scheme (default: enabled)   |
| `-P textequiv_level line` | (PAGE) Extract text from TextLine level (default: TextRegion level) |

For example:
~~~
ocrd-dinglehopper -I ABBYY-FULLTEXT,OCR-D-OCR-CALAMARI -O OCR-D-OCR-COMPARE-ABBYY-CALAMARI -P metrics false
~~~
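
Both documented parameters can be passed in the same call by repeating `-P`. The following is a minimal sketch in Python, assuming a hypothetical helper `build_ocrd_dinglehopper_cmd` that is not part of dinglehopper. It only assembles and prints the command line and does not run the processor. The filegroup names are taken from the examples in this README.

```python
import shlex


def build_ocrd_dinglehopper_cmd(input_groups, output_group, params=None):
    """Assemble an ocrd-dinglehopper argument list (hypothetical helper)."""
    cmd = ["ocrd-dinglehopper", "-I", ",".join(input_groups), "-O", output_group]
    for key, value in (params or {}).items():
        # Booleans are written as lowercase strings, as in `-P metrics false`.
        if isinstance(value, bool):
            value = "true" if value else "false"
        cmd += ["-P", key, str(value)]
    return cmd


cmd = build_ocrd_dinglehopper_cmd(
    ["OCR-D-GT-PAGE", "OCR-D-OCR-TESS"],
    "OCR-D-OCR-TESS-EVAL",
    {"metrics": False, "textequiv_level": "line"},
)
# Print a shell-ready command line for inspection; nothing is executed here.
print(shlex.join(cmd))
```

The printed command could then be pasted into a shell inside an OCR-D workspace, or the list passed to `subprocess.run` if the processor is installed.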

The tool `dinglehopper-extract` extracts the text of the given input file on
stdout, for example:

`dinglehopper-extract OCR-D-GT-PAGE/00000024.page.xml`

Developer information
---------------------
*Please refer to [README-DEV.md](README-DEV.md).*