From 56ccb39539d17ceedbfc6e69d6816971ac3c57e8 Mon Sep 17 00:00:00 2001
From: Clemens Neudecker <952378+cneud@users.noreply.github.com>
Date: Tue, 11 Oct 2022 11:37:00 +0200
Subject: [PATCH] Update README

* recommend cropping (fix #49)
* document huggingface saved_model
---
 README.md | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/README.md b/README.md
index d5cff03..d1f5c8e 100644
--- a/README.md
+++ b/README.md
@@ -8,7 +8,7 @@
 
 ## Introduction
 
-This tool performs document image binarization using trained models. The method is based on [Calvo-Zaragoza and Gallego, 2018](https://arxiv.org/abs/1706.10241).
+This tool performs document image binarization using a trained ResNet50-UNet model.
 
 ## Installation
 
@@ -18,10 +18,14 @@ Clone the repository, enter it and run
 
 ### Models
 
-Pre-trained models can be downloaded from here:
+Pre-trained models in `h5` format can be downloaded from here:
 
 https://qurator-data.de/sbb_binarization/
 
+We also provide a Tensorflow `saved_model` via Huggingface:
+
+https://huggingface.co/SBB/sbb_binarization
+
 ## Usage
 
 ```sh
@@ -32,9 +36,11 @@ sbb_binarize \
 
 ```
 
-**Note** In virtually all cases, applying the `--patches` flag will improve the quality of results.
+In virtually all cases, applying the `--patches` flag will improve the quality of results.
+
+Images containing a lot of border noise (black pixels) should be cropped beforehand to improve the quality of results.
 
-Example
+### Example
 
 ```sh
 sbb_binarize --patches -m /path/to/models/ myimage.tif myimage-bin.tif