@@ -1,10 +1,13 @@
# ocrd_repair_inconsistencies

Automatically fix PAGE-XML order inconsistencies in regions, lines and words.

Child elements are only reordered if reordering by coordinates
top-to-bottom/left-to-right fixes the appropriately concatenated `TextEquiv`
texts of the children to match the parent's `TextEquiv` text. This processor
does not change reading order, just the order of the XML elements in the file.

Automatically re-order lines, words and glyphs to become textually consistent with their parents.

PAGE-XML elements with textual annotation are re-ordered by their centroid coordinates
in top-to-bottom/left-to-right fashion iff such re-ordering fixes the inconsistency
between their appropriately concatenated `TextEquiv` texts and their parent's `TextEquiv` text.

This processor does not affect `ReadingOrder` between regions, just the order of the XML elements
below the region level, and only if not contradicting the annotated `textLineOrder`/`readingDirection`.

We wrote this as a one-shot script to fix some files. Use with caution.
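
The re-ordering rule described above (sort children by centroid, but only keep the new order if it makes the concatenated text match the parent) can be illustrated with a minimal sketch. This is not the processor's actual code: `Element`, `centroid` and `repair_order` are hypothetical names, the centroid is simplified to the mean of the polygon vertices, and the sketch assumes plain top-to-bottom/left-to-right order without consulting `textLineOrder`/`readingDirection`.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]


@dataclass
class Element:
    """Hypothetical stand-in for a PAGE-XML TextLine, Word or Glyph."""
    text: str            # first TextEquiv text of the element
    points: List[Point]  # polygon outline, as in the Coords element


def centroid(points: List[Point]) -> Point:
    # Simplification: mean of the vertices, not a true area centroid.
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def repair_order(parent_text: str, children: List[Element],
                 joiner: str, axis: int) -> Optional[List[Element]]:
    """Return the children in repaired order, or None if nothing should change.

    joiner: assumed separator ("\n" for lines, " " for words, "" for glyphs)
    axis:   0 sorts left-to-right (x), 1 sorts top-to-bottom (y)
    """
    if joiner.join(c.text for c in children) == parent_text:
        return None  # already consistent with the parent
    reordered = sorted(children, key=lambda c: centroid(c.points)[axis])
    if joiner.join(c.text for c in reordered) == parent_text:
        return reordered  # sorting fixes the inconsistency
    return None  # sorting does not help, so leave the order untouched


# Two words stored in the wrong XML order inside a line reading "Hello world".
words = [
    Element("world", [(100, 0), (150, 0), (150, 20), (100, 20)]),
    Element("Hello", [(0, 0), (50, 0), (50, 20), (0, 20)]),
]
fixed = repair_order("Hello world", words, joiner=" ", axis=0)
print([w.text for w in fixed] if fixed else "unchanged")
```

Returning `None` in both the "already consistent" and the "cannot be fixed" case mirrors the cautious behaviour described above: the element order is changed only when doing so demonstrably resolves the mismatch.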