Commit 4aed06a325 by Gerber, Mike (5 years ago):

ocrd_sbb_textline_detection used the XML output of main.py as is and, by doing so, threw away all data from the input PAGE file, including the critical pc:AlternativeImage and the less important pc:MetadataItem. Fix this by merging the segmentation results into a file created from the input file. Also add a pc:MetadataItem of type processingStep that records the segmentation operation.

File | Last updated
---|---
qurator | 5 years ago
.gitkeep | 5 years ago
Dockerfile | 5 years ago
README.md | 5 years ago
ocrd-tool.json | 5 years ago
requirements.txt | 5 years ago
setup.py | 5 years ago
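The fix described in the commit message above can be illustrated with a minimal sketch. This is not the repository's actual code: the function name `merge_segmentation`, the set of element names treated as segmentation output, the sample documents, and the `name`/`value` strings are all hypothetical, and the PAGE namespace version is an assumption. The idea is to start from the input PAGE tree, replace only the elements the segmentation produced, and append a `processingStep` metadata item, so that `pc:AlternativeImage` and existing metadata survive.

```python
import xml.etree.ElementTree as ET

# Assumed PAGE namespace version; use whatever your files actually declare.
NS = "http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15"
ET.register_namespace("pc", NS)


def q(tag):
    """Return the namespace-qualified form of a PAGE tag name."""
    return f"{{{NS}}}{tag}"


def merge_segmentation(input_root, segmented_root, step_name, step_value):
    """Merge layout results from segmented_root into input_root (in place)."""
    in_page = input_root.find(q("Page"))
    seg_page = segmented_root.find(q("Page"))

    # Hypothetical choice: only these elements count as segmentation output.
    produced = {q("ReadingOrder"), q("TextRegion")}

    # Drop stale results from the input, keep everything else
    # (for example pc:AlternativeImage).
    for child in list(in_page):
        if child.tag in produced:
            in_page.remove(child)
    # Append the new results after the preserved children.
    for child in seg_page:
        if child.tag in produced:
            in_page.append(child)

    # Record the operation next to the metadata that is already there.
    metadata = input_root.find(q("Metadata"))
    ET.SubElement(
        metadata,
        q("MetadataItem"),
        {"type": "processingStep", "name": step_name, "value": step_value},
    )
    return input_root


# Made-up sample documents, for illustration only.
INPUT = f"""<pc:PcGts xmlns:pc="{NS}">
  <pc:Metadata><pc:Creator>example</pc:Creator></pc:Metadata>
  <pc:Page imageFilename="page.tif">
    <pc:AlternativeImage filename="page.bin.png" comments="binarized"/>
  </pc:Page>
</pc:PcGts>"""

SEGMENTED = f"""<pc:PcGts xmlns:pc="{NS}">
  <pc:Page imageFilename="page.tif">
    <pc:TextRegion id="r1">
      <pc:Coords points="0,0 100,0 100,50 0,50"/>
      <pc:TextLine id="l1"><pc:Coords points="0,0 100,0 100,20 0,20"/></pc:TextLine>
    </pc:TextRegion>
  </pc:Page>
</pc:PcGts>"""

merged = merge_segmentation(
    ET.fromstring(INPUT),
    ET.fromstring(SEGMENTED),
    step_name="layout/segmentation",   # illustrative value
    step_value="example-segmenter",    # illustrative value
)
print(ET.tostring(merged, encoding="unicode"))
```

Starting from the input tree rather than the output tree means that anything the segmentation step does not know about is preserved by default, which is the point of the fix. A real implementation would also need to respect the element order required by the PAGE schema.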
README.md
Textline-Recognition
Installation:
Set up a virtual environment:
virtualenv --python=python3.6 venv
Activate virtual environment:
source venv/bin/activate
Upgrade pip:
pip install -U pip
Install package together with its dependencies in development mode:
pip install -e ./
Perform document structure and textline analysis on a scanned document image and save the result as PAGE XML.
Usage
text_line_recognition --help
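Since the result is saved as PAGE XML, it can be inspected with standard XML tooling. The following is a minimal sketch, not part of this package: the helper name `list_text_lines` and the sample document are hypothetical. It lists every text line with its region id and polygon, reading the namespace from the root element so that it does not depend on a particular PAGE schema version.

```python
import xml.etree.ElementTree as ET


def list_text_lines(page_xml):
    """Return (region_id, line_id, points) for every TextLine in a PAGE document."""
    root = ET.fromstring(page_xml)
    # Take the namespace from the root tag, e.g. "{http://...}PcGts".
    ns = root.tag[: root.tag.index("}") + 1] if root.tag.startswith("{") else ""
    lines = []
    for region in root.iter(f"{ns}TextRegion"):
        # Only direct children, so lines of nested regions are not counted twice.
        for line in region.findall(f"{ns}TextLine"):
            coords = line.find(f"{ns}Coords")
            points = coords.get("points") if coords is not None else None
            lines.append((region.get("id"), line.get("id"), points))
    return lines


# Made-up sample output, for illustration only.
SAMPLE = """<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15">
  <Page imageFilename="page.tif">
    <TextRegion id="r1">
      <TextLine id="l1"><Coords points="0,0 100,0 100,20 0,20"/></TextLine>
      <TextLine id="l2"><Coords points="0,25 100,25 100,45 0,45"/></TextLine>
    </TextRegion>
  </Page>
</PcGts>"""

for region_id, line_id, points in list_text_lines(SAMPLE):
    print(region_id, line_id, points)
```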