mirror of
https://github.com/qurator-spk/ocrd_repair_inconsistencies.git
synced 2025-06-09 03:40:06 +02:00
Merge pull request #1 from bertsky/patch-1
Improve description & documented steps & "no underscores"
commit
2205a4469c
2 changed files with 13 additions and 10 deletions
13
README.md
@@ -1,10 +1,13 @@
 # ocrd_repair_inconsistencies
 
-Automatically fix PAGE-XML order inconsistencies in regions, lines and words.
-Child elements are only reordered if reordering by coordinates
-top-to-bottom/left-to-right fixes the appropriately concatenated `TextEquiv`
-texts of the children to match the parent's `TextEquiv` text. This processor
-does not change reading order, just the order of the XML elements in the file.
+Automatically re-order lines, words and glyphs to become textually consistent with their parents.
+
+PAGE-XML elements with textual annotation are re-ordered by their centroid coordinates
+in top-to-bottom/left-to-right fashion iff such re-ordering fixes the inconsistency
+between their appropriately concatenated `TextEquiv` texts with their parent's `TextEquiv` text.
+
+This processor does not affect `ReadingOrder` between regions, just the order of the XML elements
+below the region level, and only if not contradicting the annotated `textLineOrder`/`readingDirection`.
 
 We wrote this as a one-shot script to fix some files. Use with caution.
 
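The rule described in the new README text can be sketched roughly as follows. This is a minimal illustration and not the processor's actual code: the `Element` class, the `repair_order` function, the vertex-average centroid, and the choice of separator are all assumptions made for the example. The check against the annotated `textLineOrder`/`readingDirection` is left out.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Element:
    """Hypothetical stand-in for a PAGE-XML line, word, or glyph."""
    text: str                          # the element's TextEquiv text
    points: List[Tuple[float, float]]  # polygon outline as (x, y) pairs


def centroid(el: Element) -> Tuple[float, float]:
    # Vertex average: a simplification of a true polygon centroid.
    xs = [x for x, _ in el.points]
    ys = [y for _, y in el.points]
    return sum(xs) / len(xs), sum(ys) / len(ys)


def repair_order(parent_text: str, children: List[Element],
                 separator: str, top_to_bottom: bool) -> Optional[List[Element]]:
    """Return the re-ordered children, or None if nothing should change."""
    def joined(els: List[Element]) -> str:
        return separator.join(e.text for e in els)

    if joined(children) == parent_text:
        return None  # already consistent, leave the order alone
    axis = 1 if top_to_bottom else 0  # sort by y for lines, by x for words/glyphs
    candidate = sorted(children, key=lambda e: centroid(e)[axis])
    # Only accept the new order if it actually fixes the inconsistency.
    return candidate if joined(candidate) == parent_text else None


words = [
    Element("world", [(60, 0), (100, 0), (100, 10), (60, 10)]),
    Element("hello", [(0, 0), (50, 0), (50, 10), (0, 10)]),
]
fixed = repair_order("hello world", words, " ", top_to_bottom=False)
print([w.text for w in fixed])  # ['hello', 'world']
```

Returning `None` both when the children are already consistent and when sorting does not help mirrors the "iff" in the README: the element order is touched only when sorting by coordinates makes the concatenated text match the parent.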
@@ -1,11 +1,11 @@
 {
   "tools": {
-    "ocrd_repair_inconsistencies": {
-      "executable": "ocrd_repair_inconsistencies",
+    "ocrd-repair-inconsistencies": {
+      "executable": "ocrd-repair-inconsistencies",
       "categories": [
         "Layout analysis"
       ],
-      "description": "Repair glyph/word/line order inconsistencies",
+      "description": "Re-order glyphs/words/lines top-down-left-right when textually inconsistent with their parents",
       "input_file_grp": [
         "OCR-D-SEG-BLOCK"
       ],
@@ -13,9 +13,9 @@
         "OCR-D-SEG-BLOCK-FIXED"
       ],
       "steps": [
-        "layout/segmentation/region",
         "layout/segmentation/line",
-        "layout/segmentation/words"
+        "layout/segmentation/word",
+        "layout/segmentation/glyph"
       ]
     }
   }
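The "no underscores" part of the commit message can be checked mechanically. The following is a small sketch, not part of the repository: `names_with_underscores` is a hypothetical helper, and `OLD` and `NEW` are trimmed-down copies of the tool entry before and after this commit.

```python
import json


def names_with_underscores(tool_json: str) -> list:
    """List tool keys and executable names that contain an underscore."""
    offenders = []
    for name, spec in json.loads(tool_json)["tools"].items():
        for value in (name, spec.get("executable", "")):
            if "_" in value:
                offenders.append(value)
    return offenders


# Reduced versions of the tool description before and after the change.
OLD = '{"tools": {"ocrd_repair_inconsistencies": {"executable": "ocrd_repair_inconsistencies"}}}'
NEW = '{"tools": {"ocrd-repair-inconsistencies": {"executable": "ocrd-repair-inconsistencies"}}}'

print(names_with_underscores(OLD))  # both the key and the executable are flagged
print(names_with_underscores(NEW))  # []
```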