5 Commits (4aed06a325bf7d172612198ae1b5fa00ea723b0d)

Author SHA1 Message Date
Gerber, Mike 4aed06a325 sbb_textline_detection: Preserve input PAGE info by merging segmentation results
ocrd_sbb_textline_detection used the output XML from main.py as is and,
by doing so, threw away all data from the input PAGE file, including the
critical pc:AlternativeImage and the less important pc:MetadataItem.

Fix this by merging the segmentation results into a file created from
the input file.

Also add a pc:MetadataItem processingStep about the segmentation
operation.
5 years ago
Gerber, Mike 2528573b4f sbb_textline_detector: Allow PAGE input in OCR-D interface
Previous OCR-D processors may output PAGE files instead of image files.
Resolve image files from PAGE files if necessary.
5 years ago
Gerber, Mike 2199bf0d8c 🧹 sbb_textline_detector: Remove extra .xml suffix from METS file id 5 years ago
Gerber, Mike 5fd04677f9 🐛 sbb_textline_detector: Fix filenames of created OCR-D file group 5 years ago
Gerber, Mike 0c915c75de sbb_textline_detector: Add an OCR-D interface 5 years ago
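
The merge described in commit 4aed06a325 could look roughly like the sketch below. This is not the code from the commit. It is a minimal illustration using only the Python standard library, and it assumes the PAGE 2019-07-15 namespace, that both documents have a pc:Metadata and a pc:Page element, and that only elements whose names end in "Region" need to be carried over. The function name merge_segmentation and the MetadataItem name and value strings are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: both files use the PAGE 2019-07-15 namespace.
PAGE_NS = "http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15"
NS = {"pc": PAGE_NS}
ET.register_namespace("pc", PAGE_NS)


def merge_segmentation(input_xml: str, segmentation_xml: str) -> str:
    """Hypothetical helper: copy regions from the segmentation result
    into a tree built from the input PAGE, keeping the input's data."""
    merged = ET.fromstring(input_xml)            # start from the input PAGE
    segmented = ET.fromstring(segmentation_xml)  # output of the segmenter

    target_page = merged.find("pc:Page", NS)
    source_page = segmented.find("pc:Page", NS)

    # Carry over only the region elements (TextRegion, ImageRegion, ...).
    # Everything already in the input, such as pc:AlternativeImage, stays.
    for child in source_page:
        if child.tag.endswith("Region"):
            target_page.append(child)

    # Record the operation as a processingStep metadata item.
    # The name and value strings here are placeholders, not the real ones.
    metadata = merged.find("pc:Metadata", NS)
    item = ET.SubElement(metadata, f"{{{PAGE_NS}}}MetadataItem")
    item.set("type", "processingStep")
    item.set("name", "layout/segmentation/region")
    item.set("value", "example-segmenter")

    return ET.tostring(merged, encoding="unicode")
```

Starting from the input tree rather than the segmenter's output is what keeps pc:AlternativeImage and earlier pc:MetadataItem entries intact. A fuller version would probably also need to carry over pc:ReadingOrder and respect the element order required by the PAGE schema.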