
Textline-Recognition


Tool

This tool detects textlines in an image and writes the result as XML data.

Models

To run this tool you need the corresponding models. You can find them here:

https://file.spk-berlin.de:8443/textline_detection/

Installation

sudo pip install .

Usage

sbb_textline_detector -i 'image file name' -o 'directory to write output xml' -m 'directory of models'
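
To process several images, the call above can be wrapped in a loop. The following is a minimal sketch, not part of the tool: `detect_all` and the `DETECTOR` variable are hypothetical names introduced here, and it assumes the input directory contains only image files. It uses only the `-i`, `-o` and `-m` options shown above.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with the tool): run the detector once
# for every regular file in a directory.
# Assumption: every file in "$1" is an image the tool can read.
# DETECTOR may be overridden, e.g. DETECTOR=echo for a dry run.
detect_all() {
    image_dir=$1   # directory with input images
    out_dir=$2     # directory to write output xml
    model_dir=$3   # directory of models
    mkdir -p "$out_dir" || return 1
    for image in "$image_dir"/*; do
        # Skip subdirectories and the unexpanded pattern of an empty directory.
        [ -f "$image" ] || continue
        "${DETECTOR:-sbb_textline_detector}" \
            -i "$image" -o "$out_dir" -m "$model_dir" || return 1
    done
}
```

For example, `detect_all ./images ./xml ./models` runs the detector on each file in `./images`, and `DETECTOR=echo detect_all ./images ./xml ./models` prints the arguments of each call instead of running it.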

Usage with OCR-D

ocrd-example-binarize -I OCR-D-IMG -O OCR-D-IMG-BIN
ocrd_sbb_textline_detector -I OCR-D-IMG-BIN -O OCR-D-SEG-LINE-SBB \
        -p '{ "model": "/path/to/the/models/textline_detection" }'

Segmentation works on raw RGB images, but it respects and retains AlternativeImages from binarization steps, so it is a good idea to binarize first and then run the textline detection. The binarization processor used must produce an AlternativeImage for the binarized image rather than replace the original raw RGB image.
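
One rough way to check the AlternativeImage requirement before running the detector is to look for that element name in the XML files written by the binarization step. The sketch below is an assumption-laden illustration: `missing_alternative_image` is a hypothetical helper, it assumes the binarization output is a directory of `.xml` files, and it does a plain text search rather than a real XML parse.

```shell
#!/bin/sh
# Hypothetical helper: print every .xml file in a directory that does not
# mention AlternativeImage at all.
# Assumption: "$1" holds the XML files produced by the binarization step.
# This is a crude substring search, so a match inside a comment also counts.
missing_alternative_image() {
    for page in "$1"/*.xml; do
        # Skip the unexpanded pattern when the directory has no .xml files.
        [ -f "$page" ] || continue
        grep -q 'AlternativeImage' "$page" || printf '%s\n' "$page"
    done
}
```

If `missing_alternative_image OCR-D-IMG-BIN` prints nothing, every file it inspected mentions an AlternativeImage. Any path it prints is worth a closer look before continuing.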