
Binarization

Binarization for document images

Introduction

This tool binarizes document images using pre-trained models. The method is based on Calvo-Zaragoza and Gallego (2018).

Installation

Clone the repository, change into its directory, and run:

pip install .

Models

Pre-trained models can be downloaded from:

https://qurator-data.de/sbb_binarization/

Usage

sbb_binarize \
  -m <path to directory containing model files> \
  <input image> \
  <output image>

Example

sbb_binarize -m /path/to/models/ myimage.tif myimage-bin.tif
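
To binarize a whole directory, the command above can be wrapped in a small script. The following is a minimal sketch, not part of the tool itself: it assumes sbb_binarize is on the PATH after installation and uses only the -m option and the two positional arguments shown above. The helper names (build_command, output_path_for, binarize_directory), the "*.tif" pattern and the "-bin" output suffix are illustrative assumptions.

```python
import shutil
import subprocess
from pathlib import Path


def build_command(model_dir, input_image, output_image):
    """Return the argument list for one sbb_binarize call (same form as the usage above)."""
    return ["sbb_binarize", "-m", str(model_dir), str(input_image), str(output_image)]


def output_path_for(input_image, output_dir):
    """Map e.g. myimage.tif to <output_dir>/myimage-bin.tif (naming is an assumption)."""
    input_image = Path(input_image)
    return Path(output_dir) / f"{input_image.stem}-bin{input_image.suffix}"


def binarize_directory(model_dir, input_dir, output_dir, pattern="*.tif"):
    """Run sbb_binarize once per matching image and return the output paths."""
    if shutil.which("sbb_binarize") is None:
        raise RuntimeError("sbb_binarize not found on PATH; run `pip install .` first")
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    outputs = []
    for image in sorted(Path(input_dir).glob(pattern)):
        target = output_path_for(image, output_dir)
        # check=True raises CalledProcessError if the tool exits with an error
        subprocess.run(build_command(model_dir, image, target), check=True)
        outputs.append(target)
    return outputs
```

Calling binarize_directory("/path/to/models/", "scans", "scans-bin") would then write one binarized file per input image. Each image is processed in a separate process, which keeps the sketch simple but is likely slower than a single long-running process.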

To use the OCR-D interface:

ocrd-sbb-binarize --overwrite -I INPUT_FILE_GRP -O OCR-D-IMG-BIN -P model "/var/lib/sbb_binarization"