mirror of https://github.com/mikegerber/ocrd_calamari.git synced 2025-10-20 19:44:12 +02:00

📝 README: Provide a complete example using real data and other processors

See #33.
Gerber, Mike 2020-02-05 17:39:37 +01:00
parent f2001a79f1
commit 3416a155ec
3 changed files with 20 additions and 10 deletions

.gitignore vendored

@@ -107,5 +107,6 @@ venv.bak/
 /calamari
 /calamari_models
 /gt4histocr-calamari
+/actevedef_718448162*
 /repo
 /test/assets


@@ -44,6 +44,10 @@ gt4histocr-calamari:
 	tar xfv model.tar.xz && \
 	rm model.tar.xz
+# Example data
+actevedef_718448162:
+	wget https://qurator-data.de/examples/actevedef_718448162.zip && \
+	unzip actevedef_718448162.zip
 # pip install calamari


@@ -46,18 +46,23 @@ ls gt4histocr-calamari
 ```
 ## Example Usage
+Before using `ocrd-calamari-recognize` get some example data and model, and
+prepare the document for OCR:
+```
+# Download model and example data
+make gt4histocr-calamari
+make actevedef_718448162
-~~~
-ocrd-calamari-recognize -p test-parameters.json -m mets.xml -I OCR-D-SEG-LINE -O OCR-D-OCR-CALAMARI
-~~~
+# Create binarized images and line segmentation using other OCR-D projects
+ocrd-olena-binarize -p '{ "impl": "sauvola-ms-split" }' -I OCR-D-IMG -O OCR-D-IMG-BINPAGE,OCR-D-IMG-BIN
+ocrd-tesserocr-segment-region -I OCR-D-IMG-BINPAGE -O OCR-D-SEG-REGION
+ocrd-tesserocr-segment-line -I OCR-D-SEG-REGION -O OCR-D-SEG-LINE
+```
-With `test-parameters.json`:
-~~~
-{
-  "checkpoint": "/path/to/for/example/gt4histocr-calamari/*.ckpt.json",
-  "textequiv_level": "line"
-}
-~~~
+Finally recognize the text using ocrd_calamari and the downloaded model:
+```
+ocrd-calamari-recognize -p '{ "checkpoint": "../gt4histocr-calamari/*.ckpt.json" }' -I OCR-D-SEG-LINE -O OCR-D-OCR-CALAMARI
+```
 You may want to have a look at the [ocrd-tool.json](ocrd_calamari/ocrd-tool.json) descriptions
 for additional parameters and default values.
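
The example this commit adds to the README could be run as a single script, roughly as sketched below. This is not part of the commit. It assumes that the two Makefile targets unpack into sibling directories `gt4histocr-calamari/` and `actevedef_718448162/`, that the unzipped directory is an OCR-D workspace containing a `mets.xml` (suggested by the relative `../gt4histocr-calamari` checkpoint path, but not stated in the README), and that `ocrd-olena-binarize` and the two `ocrd-tesserocr-segment-*` processors are installed. The names `calamari_params` and `run_example` are hypothetical helpers introduced only for illustration.

```shell
#!/bin/sh
set -eu

# Hypothetical helper: build the inline parameter JSON that the README
# passes literally to ocrd-calamari-recognize via -p.
calamari_params() {
    printf '{ "checkpoint": "%s" }' "$1"
}

# Hypothetical wrapper around the README commands, in the same order.
run_example() {
    # Download model and example data
    make gt4histocr-calamari
    make actevedef_718448162

    # Assumption: the unzipped directory is the OCR-D workspace (contains mets.xml).
    cd actevedef_718448162

    # Create binarized images and line segmentation using other OCR-D projects
    ocrd-olena-binarize -p '{ "impl": "sauvola-ms-split" }' -I OCR-D-IMG -O OCR-D-IMG-BINPAGE,OCR-D-IMG-BIN
    ocrd-tesserocr-segment-region -I OCR-D-IMG-BINPAGE -O OCR-D-SEG-REGION
    ocrd-tesserocr-segment-line -I OCR-D-SEG-REGION -O OCR-D-SEG-LINE

    # Recognize the text with the downloaded model
    ocrd-calamari-recognize -p "$(calamari_params '../gt4histocr-calamari/*.ckpt.json')" -I OCR-D-SEG-LINE -O OCR-D-OCR-CALAMARI
}

# Run the pipeline only when explicitly requested, so the file can be sourced safely.
if [ "${1:-}" = "run" ]; then
    run_example
fi
```

Invoking the script with the argument `run` from the repository root would execute all steps. Because of `set -eu`, the script stops at the first failing processor instead of continuing with incomplete input file groups.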