dinglehopper

dinglehopper is an OCR evaluation tool that reads ALTO, PAGE, and plain text files.

Goals

  • Useful
    • As a UI tool
    • For an automated evaluation
    • As a library
  • Unicode support

Usage

dinglehopper some-document.gt.page.xml some-document.ocr.alto.xml

This generates report.html and report.json.
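
For automated evaluation, the JSON report can be read back by a script. The following is a minimal sketch, not part of dinglehopper itself: the helper name `numeric_metrics` is hypothetical, and it assumes only that report.json holds a JSON object at the top level. It makes no assumption about which field names the report contains and simply collects whatever numeric values it finds.

```python
import json
from pathlib import Path


def numeric_metrics(report_path):
    """Return the top-level numeric fields of a JSON report as a dict.

    Hypothetical helper: assumes the report is a JSON object at the top
    level and does not rely on any particular field names.
    """
    with Path(report_path).open(encoding="utf-8") as f:
        report = json.load(f)
    # Keep only plain numbers; bool is a subclass of int, so exclude it.
    return {
        key: value
        for key, value in report.items()
        if isinstance(value, (int, float)) and not isinstance(value, bool)
    }
```

After a run, `numeric_metrics("report.json")` would return the numeric values found in the report, which a CI job could then compare against a threshold of its own choosing.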

As an OCR-D processor:

ocrd-dinglehopper -m mets.xml -I OCR-D-GT-PAGE,OCR-D-OCR-TESS -O OCR-D-OCR-TESS-EVAL

This generates HTML and JSON reports in the OCR-D-OCR-TESS-EVAL filegroup.

Figure: dinglehopper displaying metrics and character differences