sbb_textline_detection: Preserve input PAGE info by merging segmentation results

ocrd_sbb_textline_detection used the XML output of main.py as is and
thereby discarded all data from the input PAGE file, including the
critical pc:AlternativeImage and the less important pc:MetadataItem.

Fix this by merging the segmentation results into a file created from
the input file.

Also add a pc:MetadataItem processingStep about the segmentation
operation.
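
The merge described above can be sketched roughly as follows. This is a
minimal illustration using only the Python standard library, not the code
from this commit: the function name merge_segmentation, its parameters, the
default name/value strings, and the choice of the 2019-07-15 PAGE namespace
are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Assumed PAGE namespace version; real files may use a different one.
NS = "http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15"
ET.register_namespace("pc", NS)


def q(tag):
    """Qualify a tag name with the PAGE namespace."""
    return "{%s}%s" % (NS, tag)


def merge_segmentation(input_xml, segmentation_xml,
                       step_name="segmentation",          # hypothetical value
                       step_value="textline-detection"):  # hypothetical value
    """Return input_xml with the TextRegions of segmentation_xml merged in."""
    # Start from the input file so that existing elements such as
    # pc:AlternativeImage and pc:MetadataItem are preserved.
    merged = ET.fromstring(input_xml)
    seg = ET.fromstring(segmentation_xml)

    page = merged.find(q("Page"))
    seg_page = seg.find(q("Page"))

    # Take over the regions produced by the segmentation.
    for region in seg_page.findall(q("TextRegion")):
        page.append(region)

    # Record the operation as a processingStep metadata item.
    metadata = merged.find(q("Metadata"))
    if metadata is not None:
        ET.SubElement(metadata, q("MetadataItem"), {
            "type": "processingStep",
            "name": step_name,
            "value": step_value,
        })

    return ET.tostring(merged, encoding="unicode")
```

The point of the design is that the output tree is derived from the input
document and only extended, rather than being replaced by the segmentation
output.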
Gerber, Mike 2019-11-19 15:08:53 +01:00
parent 4fb3e70ef6
commit 4aed06a325
2 changed files with 44 additions and 8 deletions

@@ -9,5 +9,4 @@ scikit-learn
 tensorflow-gpu < 2.0
 scipy
 click
-ocrd >= 1.0.0b19
+ocrd >= 2.0.0