Merge pull request #1 from bertsky/patch-1

Improve description & documented steps & "no underscores"
Mike Gerber 5 years ago committed by GitHub
commit 2205a4469c

@@ -1,10 +1,13 @@
 # ocrd_repair_inconsistencies
-Automatically fix PAGE-XML order inconsistencies in regions, lines and words.
+Automatically re-order lines, words and glyphs to become textually consistent with their parents.
-Child elements are only reordered if reordering by coordinates
-top-to-bottom/left-to-right fixes the appropriately concatenated `TextEquiv`
-texts of the children to match the parent's `TextEquiv` text. This processor
-does not change reading order, just the order of the XML elements in the file.
+PAGE-XML elements with textual annotation are re-ordered by their centroid coordinates
+in top-to-bottom/left-to-right fashion iff such re-ordering fixes the inconsistency
+between their appropriately concatenated `TextEquiv` texts with their parent's `TextEquiv` text.
+This processor does not affect `ReadingOrder` between regions, just the order of the XML elements
+below the region level, and only if not contradicting the annotated `textLineOrder`/`readingDirection`.
 We wrote this as a one-shot script to fix some files. Use with caution.

@@ -1,11 +1,11 @@
 {
   "tools": {
-    "ocrd_repair_inconsistencies": {
-      "executable": "ocrd_repair_inconsistencies",
+    "ocrd-repair-inconsistencies": {
+      "executable": "ocrd-repair-inconsistencies",
       "categories": [
         "Layout analysis"
       ],
-      "description": "Repair glyph/word/line order inconsistencies",
+      "description": "Re-order glyphs/words/lines top-down-left-right when textually inconsistent with their parents",
       "input_file_grp": [
         "OCR-D-SEG-BLOCK"
       ],
@@ -13,9 +13,9 @@
         "OCR-D-SEG-BLOCK-FIXED"
       ],
       "steps": [
-        "layout/segmentation/region",
         "layout/segmentation/line",
-        "layout/segmentation/words"
+        "layout/segmentation/word",
+        "layout/segmentation/glyph"
       ]
     }
   }
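
The re-ordering rule described in the README text of this commit can be illustrated with a minimal sketch. This is not the processor's actual code: the `Element` class, `centroid`, `repair_order`, `by_x` and `by_y` are hypothetical names used only for illustration, the centroid is approximated as the mean of the outline points, and the check against the annotated `textLineOrder`/`readingDirection` is left out.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Point = Tuple[float, float]


@dataclass
class Element:
    """Hypothetical stand-in for a PAGE-XML region, line, word or glyph."""
    text: str                     # the element's TextEquiv text
    polygon: List[Point]          # outline coordinates as (x, y) pairs
    children: List["Element"] = field(default_factory=list)


def centroid(polygon: List[Point]) -> Point:
    """Approximate the centroid as the mean of the outline points."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return sum(xs) / len(xs), sum(ys) / len(ys)


def by_x(element: Element) -> float:
    """Sort key for left-to-right order (words in a line, glyphs in a word)."""
    return centroid(element.polygon)[0]


def by_y(element: Element) -> float:
    """Sort key for top-to-bottom order (lines in a region)."""
    return centroid(element.polygon)[1]


def repair_order(parent: Element, joiner: str,
                 key: Callable[[Element], float]) -> bool:
    """Re-order parent.children by `key` only if doing so makes the joined
    child texts equal to the parent's text. Returns True if re-ordered."""
    def joined(elements: List[Element]) -> str:
        return joiner.join(e.text for e in elements)

    if joined(parent.children) == parent.text:
        return False              # already consistent, nothing to do
    candidate = sorted(parent.children, key=key)
    if joined(candidate) != parent.text:
        return False              # sorting does not fix it, leave untouched
    parent.children = candidate
    return True


def box(x0: float, y0: float, x1: float, y1: float) -> List[Point]:
    """Axis-aligned rectangle as a polygon (illustration only)."""
    return [(x0, y0), (x1, y0), (x1, y1), (x0, y1)]


# A line whose words are stored in the wrong XML order.
line = Element("hello world", box(0, 0, 100, 10), [
    Element("world", box(60, 0, 100, 10)),
    Element("hello", box(0, 0, 50, 10)),
])
changed = repair_order(line, " ", by_x)
```

In this sketch the children are only touched when sorting actually resolves the mismatch, so an inconsistency that coordinate order cannot explain (for example, a parent text that differs in content from its children) leaves the element order as it was.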
