Commit Graph

75 Commits (694af0f903ba9fa4d7596ff9fe9e0a3e17aa06f9)

Author SHA1 Message Date
Konstantin Baierer 5c7fd26883 processor: self.resolve_resource model
vahidrezanezhad c241830b4b Merge pull request from bertsky/fix-coords
ensure valid coordinates by intersection with parent…
vahidrezanezhad 4c498fcad2 resolving issue https://github.com/qurator-spk/sbb_textline_detection/issues/53
Robert Sachunsky 261db14ec3 ensure valid coordinates by intersection with parent…
- Border: intersect with page frame
- text regions: intersect with (new) Border
- text lines: intersect with (new) text region
  (and back-transform at all)
Gerber, Mike 665b739fb8 🐛 sbb_textline_detector: Re-base Border coords too
Gerber, Mike 006c7765b0 🐛 sbb_textline_detector: Filter cropped images (OCR-D)
Gerber, Mike 37cc513ce9 🚧 sbb_textline_detector: Translate detected coordinates
Gerber, Mike a9b9c8a885 🚧 sbb_textline_detector: Get image via image_from_page
Konstantin Baierer f167f6768c getLogger per method
vahidrezanezhad 0f09f4a1f6 Update main.py
Issues 30 and 40 are resolved
Clemens Neudecker e4798c6811 replace 'PrintSpace' with 'Border'
Clemens Neudecker 36adbe29d8 replace 'PrintSpace' with 'Border'
Konstantin Baierer 05deb03ec8 use make_file_id and assert_file_grp_cardinality
Gerber, Mike 8b01d9e671 🐛 sbb_textline_detection: Set pcGtsId
Newest OCR-D workspace validation requires that the pcGtsId of a
PAGE-XML file matches its METS mets:file/ID. Fix this by setting
it correctly.
Mike Gerber 3593506e72 🔧 ocrd-tool.json: Update description, steps and categories
Fixes .
Lucas Sulzbach ead1eae114 ocrd-tool.json: Make description OCR-D compliant
vahidrezanezhad f94944ee80 change scaling
b-vr103 b9caa8e12c resolve 2020-02-17-bug-sbb_textline_detector
b-vr103 1446d7c662 getting robust and doing sth for verticals
b-vr103 3941f2f17d gettin robust and doing sth for verticals
Gerber, Mike f90b3cfa86 🔊 sbb_textline_detector: In OCR-D interface, warn if overwriting existing segmentation
Gerber, Mike 11c0e9cee5 🐛 sbb_textline_detector: Do not print PAGE output to stdout
ocrd-sbb-textline-detector uses ocrd_page's parse() to parse XML input,
which writes the XML to stdout by default.

Fix this by silencing it using parse()'s silence=True.
wrznr 4fc57d7756 Assign page id
wrznr 9e9163e852 Simplify the iteration over files in the input file group
Mike Gerber 6e0decb5ec Merge pull request from kba/rename-tool
Rename ocrd_sbb.. to ocrd-sbb... in ocrd_cli.py, ht @bertsky
Gerber, Mike 5fb30a7a1f Revert "Merge branch 'master' of https://github.com/qurator-spk/sbb_textline_detector"
This reverts commit 417b9235d5, reversing
changes made to a74974b7b6.
Konstantin Baierer cf6381c148 Rename ocrd_sbb.. to ocrd-sbb... in ocrd_cli.py, ht @bertsky
Clemens Neudecker 51e241fd84 Merge pull request from cneud/cneud-fix-typos
Fix typos
Clemens Neudecker 12c07f389d Merge pull request from cneud/cneud-fix-docstring
fix docstring
Clemens Neudecker 29870f26e1 Merge pull request from cneud/cneud-PAGE2019
PAGE2019
Konstantin Baierer b6ca1a7c53 kebab-case snake_case executable, fix
Clemens Neudecker 6c0bfba686 fix typos
Clemens Neudecker c8bc468628 fix docstring
Clemens Neudecker e696a068cb Fix typos
Clemens Neudecker d90dad48fd PAGE2019
Rezanezhad, Vahid 19116091f9 Update config_params.json
Gerber, Mike af5cbe9052 🐛 sbb_textline_detector: Fix making the output file id
Rezanezhad, Vahid 2112bb18c6 fixed the bug: local variable 't4' referenced before assignment
Rezanezhad, Vahid a11f6740cb Update main.py - robust deskewing and better page extraction
Rezanezhad, Vahid 0182b7087f remove multiprocessing bug
Gerber, Mike 8fa7179560 🐛 sbb_textline_detector: Disable multiprocessing to fix race condition
Lines were sorted in the wrong regions. Work around this by disabling
multiprocessing until a proper fix is done.
Gerber, Mike 4aed06a325 sbb_textline_detection: Preserve input PAGE info by merging segmentation results
ocrd_sbb_textline_detection used the output XML by main.py as is, and
– by doing this – threw away any input data from the input PAGE,
including the critical pc:AlternativeImage and the less important
pc:MetadataItem.

Fix this by merging the segmentation results into a file created from
the input file.

Also add a pc:MetadataItem processingStep about the segmentation
operation.
Gerber, Mike 4fb3e70ef6 🧹 sbb_textline_detector: Do not create empty/space-only TextEquivs (again)
Gerber, Mike bf41a29e7b 🐛 sbb_textline_detector: Do not hardcode Created/LastChange elements
Gerber, Mike fbd21cdb81 🧹 sbb_textline_detector: Do not create empty/space-only TextEquivs (again)
Rezanezhad, Vahid 2d6dd92b31 Update main.py
Rezanezhad, Vahid 9f97f34255 Update main.py
Rezanezhad, Vahid 8c954a6c7a Update main.py
Rezanezhad, Vahid 6714481556 Update main.py
Rezanezhad, Vahid 719824f19d Update main.py