12 Commits (4fc57d7756f809e52e189b51d6b585b235d0ce7f)

Author SHA1 Message Date
wrznr 4fc57d7756 Assign page id 5 years ago
wrznr 9e9163e852 Simplify the iteration over files in the input file group 5 years ago
Mike Gerber 6e0decb5ec Merge pull request #12 from kba/rename-tool
Rename ocrd_sbb.. to ocrd-sbb... in ocrd_cli.py, ht @bertsky
5 years ago
Gerber, Mike 5fb30a7a1f Revert "Merge branch 'master' of https://github.com/qurator-spk/sbb_textline_detector"
This reverts commit 417b9235d5, reversing
changes made to a74974b7b6.
5 years ago
Konstantin Baierer cf6381c148 Rename ocrd_sbb.. to ocrd-sbb... in ocrd_cli.py, ht @bertsky 5 years ago
Rezanezhad, Vahid 19116091f9 Update config_params.json 5 years ago
Gerber, Mike af5cbe9052 🐛 sbb_textline_detector: Fix making the output file id 5 years ago
Gerber, Mike 4aed06a325 sbb_textline_detection: Preserve input PAGE info by merging segmentation results
ocrd_sbb_textline_detection used the output XML by main.py as is, and
– by doing this – threw away any input data from the input PAGE,
including the critical pc:AlternativeImage and the less important
pc:MetadataItem.

Fix this by merging the segmentation results into a file created from
the input file.

Also add a pc:MetadataItem processingStep about the segmentation
operation.
5 years ago
Gerber, Mike 2528573b4f sbb_textline_detector: Allow PAGE input in OCR-D interface
Previous OCR-D processors may output PAGE files instead of image files.
Resolve image files from PAGE files if necessary.
5 years ago
Gerber, Mike 2199bf0d8c 🧹 sbb_textline_detector: Remove extra .xml suffix from METS file id 5 years ago
Gerber, Mike 5fd04677f9 🐛 sbb_textline_detector: Fix filenames of created OCR-D file group 5 years ago
Gerber, Mike 0c915c75de sbb_textline_detector: Add an OCR-D interface 5 years ago