📝 Document OCR-D parameters and restructure README a bit

pull/38/head
Gerber, Mike 4 years ago
parent 8b4ee20a40
commit 0f3857d8d3

@@ -60,6 +60,15 @@ dinglehopper some-document.gt.page.xml some-document.ocr.alto.xml
This generates `report.html` and `report.json`.
### dinglehopper-extract
The tool `dinglehopper-extract` extracts the text of the given input file on
stdout, for example:
~~~
dinglehopper-extract --textequiv-level line OCR-D-GT-PAGE/00000024.page.xml
~~~
### OCR-D
As an OCR-D processor:
~~~
ocrd-dinglehopper -I OCR-D-GT-PAGE,OCR-D-OCR-TESS -O OCR-D-OCR-TESS-EVAL
@@ -69,18 +78,18 @@ This generates HTML and JSON reports in the `OCR-D-OCR-TESS-EVAL` filegroup.
![dinglehopper displaying metrics and character differences](.screenshots/dinglehopper.png?raw=true)
You may also want to disable metrics and the green-red color scheme by
parameter:
The OCR-D processor has these parameters:
| Parameter | Meaning |
| ------------------------- | ------------------------------------------------------------------- |
| `-P metrics false` | Disable metrics and the green-red color scheme (default: enabled) |
| `-P textequiv_level line` | (PAGE) Extract text from TextLine level (default: TextRegion level) |
For example:
~~~
ocrd-dinglehopper -I ABBYY-FULLTEXT,OCR-D-OCR-CALAMARI -O OCR-D-OCR-COMPARE-ABBYY-CALAMARI -P metrics false
~~~
The tool `dinglehopper-extract` extracts the text of the given input file on
stdout, for example:
`dinglehopper-extract OCR-D-GT-PAGE/00000024.page.xml`
Developer information
---------------------
*Please refer to [README-DEV.md](README-DEV.md).*
