Mirror of https://github.com/mikegerber/ocrd_calamari.git (synced 2025-07-11 02:59:55 +02:00)
📝 README: Provide a complete example using real data and other processors
See #33.
parent f2001a79f1
commit 3416a155ec
3 changed files with 20 additions and 10 deletions
--- a/.gitignore
+++ b/.gitignore
@@ -107,5 +107,6 @@ venv.bak/
 /calamari
 /calamari_models
 /gt4histocr-calamari
+/actevedef_718448162*
 /repo
 /test/assets
--- a/Makefile
+++ b/Makefile
@@ -44,6 +44,10 @@ gt4histocr-calamari:
 	tar xfv model.tar.xz && \
 	rm model.tar.xz
 
+# Example data
+actevedef_718448162:
+	wget https://qurator-data.de/examples/actevedef_718448162.zip && \
+	unzip actevedef_718448162.zip
 
 
 # pip install calamari
--- a/README.md
+++ b/README.md
@@ -46,18 +46,23 @@ ls gt4histocr-calamari
 ```
 
 ## Example Usage
+Before using `ocrd-calamari-recognize` get some example data and model, and
+prepare the document for OCR:
+```
+# Download model and example data
+make gt4histocr-calamari
+make actevedef_718448162
 
-~~~
-ocrd-calamari-recognize -p test-parameters.json -m mets.xml -I OCR-D-SEG-LINE -O OCR-D-OCR-CALAMARI
-~~~
+# Create binarized images and line segmentation using other OCR-D projects
+ocrd-olena-binarize -p '{ "impl": "sauvola-ms-split" }' -I OCR-D-IMG -O OCR-D-IMG-BINPAGE,OCR-D-IMG-BIN
+ocrd-tesserocr-segment-region -I OCR-D-IMG-BINPAGE -O OCR-D-SEG-REGION
+ocrd-tesserocr-segment-line -I OCR-D-SEG-REGION -O OCR-D-SEG-LINE
+```
 
-With `test-parameters.json`:
-~~~
-{
-"checkpoint": "/path/to/for/example/gt4histocr-calamari/*.ckpt.json",
-"textequiv_level": "line"
-}
-~~~
+Finally recognize the text using ocrd_calamari and the downloaded model:
+```
+ocrd-calamari-recognize -p '{ "checkpoint": "../gt4histocr-calamari/*.ckpt.json" }' -I OCR-D-SEG-LINE -O OCR-D-OCR-CALAMARI
+```
 
 You may want to have a look at the [ocrd-tool.json](ocrd_calamari/ocrd-tool.json) descriptions
 for additional parameters and default values.
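The workflow this commit adds to the README can be sketched as one shell script. This is a rough illustration, not part of the commit: the `run` and `pipeline` helpers and the `DRY_RUN` switch are hypothetical names introduced here, and the `cd actevedef_718448162` step is an assumption (the README does not state the working directory, but the relative checkpoint path `../gt4histocr-calamari/*.ckpt.json` suggests the commands run inside the unzipped example directory). By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch of the README example as a single script (hypothetical helper names).
# DRY_RUN=1 (default) only prints each command; DRY_RUN=0 executes it.
set -eu

DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
# Note: the printed form does not reproduce the original shell quoting.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

pipeline() {
    # Download model and example data (Makefile targets from this commit)
    run make gt4histocr-calamari
    run make actevedef_718448162

    # Assumption: the OCR-D workspace is the unzipped example directory
    run cd actevedef_718448162

    # Binarization and segmentation using other OCR-D processors
    run ocrd-olena-binarize -p '{ "impl": "sauvola-ms-split" }' \
        -I OCR-D-IMG -O OCR-D-IMG-BINPAGE,OCR-D-IMG-BIN
    run ocrd-tesserocr-segment-region -I OCR-D-IMG-BINPAGE -O OCR-D-SEG-REGION
    run ocrd-tesserocr-segment-line -I OCR-D-SEG-REGION -O OCR-D-SEG-LINE

    # Text recognition with ocrd_calamari and the downloaded model
    run ocrd-calamari-recognize \
        -p '{ "checkpoint": "../gt4histocr-calamari/*.ckpt.json" }' \
        -I OCR-D-SEG-LINE -O OCR-D-OCR-CALAMARI
}

pipeline
```

Running the script as is lists the seven steps in order. Setting `DRY_RUN=0` would execute them, which requires `make`, `wget`, `unzip` and the three OCR-D processor packages to be installed.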