Textline-Recognition
Tool
This tool detects text lines in an image and writes the result as XML data.
Models
To run this tool, you need the corresponding models. You can find them here:
https://file.spk-berlin.de:8443/textline_detection/
Installation
sudo pip install .
Usage
sbb_textline_detector -i 'image file name' -o 'directory to write output xml' -m 'directory of models'
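The command above handles one image per call. To process a whole directory, a small wrapper can call it in a loop. The following is only a minimal sketch: the helper names `build_command` and `detect_all`, the `*.tif` pattern, and the `dry_run` switch are assumptions made for illustration and are not part of this tool. Only the `-i`, `-o` and `-m` options come from the usage line above.

```python
import subprocess
from pathlib import Path


def build_command(image, out_dir, model_dir):
    """Return the argument list for one sbb_textline_detector call.

    Hypothetical helper: it only assembles the -i/-o/-m options
    shown in the usage line; it does not run anything.
    """
    return [
        "sbb_textline_detector",
        "-i", str(image),
        "-o", str(out_dir),
        "-m", str(model_dir),
    ]


def detect_all(image_dir, out_dir, model_dir, pattern="*.tif", dry_run=False):
    """Call the detector once per matching image and return the commands.

    The file pattern is an assumption; adjust it to your image format.
    With dry_run=True the commands are only collected, not executed.
    """
    commands = []
    for image in sorted(Path(image_dir).glob(pattern)):
        cmd = build_command(image, out_dir, model_dir)
        commands.append(cmd)
        if not dry_run:
            # Create the output directory before the first real run.
            Path(out_dir).mkdir(parents=True, exist_ok=True)
            # check=True raises CalledProcessError if the tool fails.
            subprocess.run(cmd, check=True)
    return commands
```

Calling `detect_all("scans", "out", "models", dry_run=True)` first is a cheap way to inspect the commands before running them for real.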
Usage with OCR-D
ocrd-example-binarize -I OCR-D-IMG -O OCR-D-IMG-BIN
ocrd_sbb_textline_detector -I OCR-D-IMG-BIN -O OCR-D-SEG-LINE-SBB \
-p '{ "model": "/path/to/the/models/textline_detection" }'
Segmentation works on raw RGB images, but respects and retains AlternativeImages from binarization steps, so it's a good idea to do binarization first, then perform the textline detection. The binarization processor used must produce an AlternativeImage for the binarized image, not replace the original raw RGB image.
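The order described here (binarize first, then detect text lines) can be scripted. The sketch below is a hedged illustration, not part of this project: `build_workflow` and `run_workflow` are hypothetical helper names, `ocrd-example-binarize` is the placeholder processor from the example above, and it is assumed that both processors are started from inside an OCR-D workspace directory. Building the `-p` value with `json.dumps` avoids quoting mistakes in the JSON parameter string.

```python
import json
import subprocess


def build_workflow(model_dir, binarizer="ocrd-example-binarize"):
    """Return the two commands in order: binarization, then line detection.

    The file group names are the ones used in the example above.
    The binarizer name is a placeholder; substitute a real processor.
    """
    # Serialize the processor parameters as a JSON string for -p.
    params = json.dumps({"model": model_dir})
    return [
        [binarizer, "-I", "OCR-D-IMG", "-O", "OCR-D-IMG-BIN"],
        [
            "ocrd_sbb_textline_detector",
            "-I", "OCR-D-IMG-BIN",
            "-O", "OCR-D-SEG-LINE-SBB",
            "-p", params,
        ],
    ]


def run_workflow(model_dir, workspace_dir="."):
    """Run both steps in the given workspace; stop at the first failure."""
    for cmd in build_workflow(model_dir):
        # check=True raises CalledProcessError on a non-zero exit code.
        subprocess.run(cmd, cwd=workspace_dir, check=True)
```

The output file group of the first step is the input file group of the second, which is what lets the detector pick up the binarized AlternativeImage.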