mirror of https://github.com/qurator-spk/dinglehopper.git (synced 2025-06-17 23:59:59 +02:00)
📝 Document CER/WER and the format detection (Fixes GH-26)
parent da47e41c85
commit d706ef4621
2 changed files with 10 additions and 2 deletions
@@ -31,13 +31,17 @@ Usage: dinglehopper [OPTIONS] GT OCR [REPORT_PREFIX]
 
 Compare the PAGE/ALTO/text document GT against the document OCR.
 
+dinglehopper detects if GT/OCR are ALTO or PAGE XML documents to extract
+their text and falls back to plain text if no ALTO or PAGE is detected.
+
 The files GT and OCR are usually a ground truth document and the result of
 an OCR software, but you may use dinglehopper to compare two OCR results.
 In that case, use --no-metrics to disable the then meaningless metrics and
 also change the color scheme from green/red to blue.
 
 The comparison report will be written to $REPORT_PREFIX.{html,json}, where
-$REPORT_PREFIX defaults to "report".
+$REPORT_PREFIX defaults to "report". The reports include the character
+error rate (CER) and the word error rate (WER).
 
 Options:
 --metrics / --no-metrics Enable/disable metrics and green/red
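
For readers unfamiliar with the two metrics named in the new help text, here is a minimal sketch of how CER and WER are commonly defined: the edit distance between ground truth and OCR output, divided by the length of the ground truth, counted in characters or in words. This is an illustration only and not dinglehopper's implementation. The function names (`levenshtein`, `error_rate`, `cer`, `wer`) are hypothetical, and the sketch assumes plain code points and whitespace tokenization, which a real tool may well handle differently (for example through Unicode normalization or a different word segmentation).

```python
def levenshtein(a, b):
    """Edit distance between two sequences (insert, delete, substitute; cost 1 each)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,             # delete x
                curr[j - 1] + 1,         # insert y
                prev[j - 1] + (x != y),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def error_rate(gt_items, ocr_items):
    """Edit distance normalized by the number of ground-truth items."""
    if not gt_items:
        # Assumption: an empty ground truth gives 0 for empty OCR, else infinity.
        return 0.0 if not ocr_items else float("inf")
    return levenshtein(gt_items, ocr_items) / len(gt_items)


def cer(gt, ocr):
    """Character error rate over code points (simplification)."""
    return error_rate(list(gt), list(ocr))


def wer(gt, ocr):
    """Word error rate over whitespace-separated tokens (simplification)."""
    return error_rate(gt.split(), ocr.split())
```

Because the distance is divided by the ground-truth length, both rates can exceed 1 when the OCR output contains many insertions. The metrics are also asymmetric in GT and OCR, which is one reason they stop being meaningful when two OCR results are compared against each other.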