Robert Sachunsky
bcec0c4a55
Merge 9801129aa6 into 1df32eba87
2026-05-22 10:38:11 +00:00
Robert Sachunsky
9801129aa6
estimate_skew_contours: ensure retval is always float
2026-05-22 12:37:07 +02:00
Robert Sachunsky
26afc5ddab
ModelZoo: ensure exported TensorShape is converted to plain tuple
2026-05-22 12:35:44 +02:00
Robert Sachunsky
0836230c6b
utils_ocr: avoid module-level import of TF
2026-05-21 22:50:53 +02:00
Robert Sachunsky
f3a93983c0
ModelZoo: add ocr key for memory_limit
2026-05-21 22:50:13 +02:00
Robert Sachunsky
ea41dcae1d
trocr: use beam search instead of greedy decoding
2026-05-21 17:52:27 +02:00
Robert Sachunsky
074753a98e
ModelZoo: fix Torch device selection
2026-05-21 17:25:53 +02:00
Robert Sachunsky
000e4ac8d8
trocr: extract confidence, too
2026-05-21 17:25:39 +02:00
Robert Sachunsky
f3649adbf2
trocr: apply do_not_mask_with_textline_contour here, too
2026-05-21 17:23:11 +02:00
Robert Sachunsky
1d67e65f11
trocr: simplify, batch over entire page…
- batching over entire page instead of region-wise
(underfilling batches)
- avoid copied redundant code
2026-05-21 15:48:21 +02:00
Robert Sachunsky
d50bd7c650
trocr: avoid warnings by passing clean_up_tokenization_spaces=False
2026-05-21 14:20:51 +02:00
Robert Sachunsky
f9f9130dbb
do_order_of_regions: remove redundant+overcautious assertion
2026-05-21 03:21:36 +02:00
Robert Sachunsky
bf7ec0233d
ModelZoo.load_model: use memory_limit instead of memory_growth…
- growth strategy is more flexible, but uses much more VRAM
- limit strategy needs to be calibrated to models (currently fixed),
and batch size, but needs much less VRAM and is faster
2026-05-21 02:43:34 +02:00
Robert Sachunsky
94a5e9da14
ModelZoo.load_model: avoid attempting to load exported models as Keras models (which causes a warning), but switch to TF-Serving import right away
2026-05-21 02:41:19 +02:00
Robert Sachunsky
7f2bf715df
ModelZoo.load_model: fix loading exported vs saved models
2026-05-21 02:39:59 +02:00
Robert Sachunsky
3de1407d18
drop unnecessary TF / Torch imports
2026-05-21 02:38:20 +02:00
Robert Sachunsky
bdfebd2c70
reload_weights: save() → export() w/ serve() inference
2026-05-19 03:40:18 +02:00
Robert Sachunsky
86adaf299a
training.models.transformer_block: tf.reshape → Keras Reshape layer
2026-05-19 03:40:16 +02:00
Robert Sachunsky
9efce5e9f2
Predictor.shutdown: use join() instead of terminate()
2026-05-19 03:40:07 +02:00
Robert Sachunsky
ffe5cdc519
ModelZoo.shutdown: drop extra del (already done by shutdown())
2026-05-19 03:40:05 +02:00
Robert Sachunsky
481c286da9
ModelZoo.load_model: no XLA compilation
2026-05-19 03:40:05 +02:00
Robert Sachunsky
f329e10a80
test_layout: rm ignored --allow_scaling option
2026-05-19 03:40:04 +02:00
Robert Sachunsky
17b311441a
model_zoo: also parse comma/colon syntax for device in Torch case
2026-05-19 03:40:03 +02:00
Robert Sachunsky
be4fe8c263
contour: drop unused functions depending on rotation_image_new()
2026-05-19 03:40:02 +02:00
Robert Sachunsky
87cce6c963
CLI tests: add opt-in envvar EYNOLLAH_OPTIONS for device selection, model directory etc.
2026-05-19 03:40:01 +02:00
Robert Sachunsky
1ed633bc25
test_model_zoo: adapt (load_models instead of load_model)
2026-05-19 03:40:00 +02:00
Robert Sachunsky
21ecb043f7
CLIs: move --device option to group level
2026-05-19 03:39:59 +02:00
Robert Sachunsky
7ed1a1ebac
CLIs: allow -h and show defaults uniformly, harmonise help, drop remaining redundant negative options
2026-05-19 03:39:56 +02:00
Robert Sachunsky
cd62f13872
eynollah_ocr: make work again, re-use Eynollah base class…
- re-use Eynollah base class
- use `ModelZoo.load_models()` instead of `load_model()`
- pass in `device` init kwarg, delegate to `ModelZoo.load_models()`
- `device`: return Torch device at loaded model tensors
instead of ad-hoc selection
- make numeric init kwargs non-optional (only numeric)
2026-05-19 03:39:55 +02:00
Robert Sachunsky
ded668a256
model_zoo: fix clash between Predictor and direct (OCR) use-cases…
- `load_models()`: uniformly handle arg types
- `load_model()`: move handling of non-model categories
to `load_models()`
- `load_model()`: move SavedModel preference over HDF5 to `model_path()`
- `_load_ocr_model()`: add user-selected device handling and reporting
for Torch (as for TF)
- `_load_ocr_model()`: move (TF-based) CNN-RNN case to `load_model()`
(including Keras layer mapping)
- `shutdown()`: only apply `shutdown()` to Predictor model types
2026-05-19 03:39:53 +02:00
Robert Sachunsky
98e6fbbcbb
mbreorder: make work again, re-use Eynollah base class
2026-05-19 03:39:52 +02:00
Robert Sachunsky
7e8b9311d3
Revert "test_model_zoo: fix calls"
This reverts commit 5a98f55be3.
2026-05-19 03:32:37 +02:00
Robert Sachunsky
a1449da1d1
Revert "fix model loading in mb_ro and ocr"
This reverts commit 218a95e6a0.
2026-05-19 03:32:19 +02:00
kba
1df32eba87
CD: base docker image: typo {,v}3.13.0
2026-05-11 13:41:30 +02:00
kba
d7337a3080
CD: base docker image on versioned ocrd/core-cuda-tf2:v3.13.0
2026-05-11 13:38:36 +02:00
kba
e612db2bb1
📦 v0.8.0
2026-05-11 13:16:30 +02:00
kba
6cfbd93ac7
📝 changelog
2026-05-11 13:14:56 +02:00
kba
c7104c2852
Merge branch 'prepare-release-v0.8.0'
2026-05-11 13:12:19 +02:00
kba
5a98f55be3
test_model_zoo: fix calls
2026-05-11 12:22:24 +02:00
kba
218a95e6a0
fix model loading in mb_ro and ocr
2026-05-11 12:19:20 +02:00
kba
2035b07b55
Merge remote-tracking branch 'bertsky/ro-fixes-final' into prepare-release-v0.8.0
# Conflicts:
# requirements-ocr.txt
2026-05-11 09:46:17 +02:00
Robert Sachunsky
db87aa995d
reqs for OCR: relax ad5f2272 (depending on Python version)
2026-05-11 03:15:54 +02:00
Robert Sachunsky
e183937c5d
separate_lines_new2: fix coord overflow by clipping, simplify…
- found positive and negative peaks, and even more so their
relative offsets, may overflow in the cropped image,
causing fake textlines; avoid that by clipping to the valid
y coordinates
- calculation for number of tiles: sometimes one fewer
tile is needed by making the previous last tile
half-full on the right side
- add some (commented) plotting
- simplify (a lot, but only partially)
2026-05-11 03:09:02 +02:00
Robert Sachunsky
130f0aee42
do_work_of_slopes_curved: improve on d257869d…
- relative images now need larger relative min_area
(i.e. compensation factors)
- do not attempt (even) single-line skew estimation
(via linear regression) if there is no (large enough)
contour at all
- avoid re-computing `mask_parent`
- add some (commented) plotting
2026-05-11 03:03:04 +02:00
kba
ce5d6bc43c
try to accommodate outdated Python versions unsupported by current transformers
2026-05-09 18:03:40 +02:00
kba
03f3f9af17
update model zoo and docs to link to v0_8_0 model release on zenodo
2026-05-09 17:58:59 +02:00
Robert Sachunsky
a61fb09ec5
CI: drop py3.8 (unavailable for new req transformers >= 5)
2026-05-09 04:14:49 +02:00
Robert Sachunsky
4406a0299e
update CLI test for binarization…
- update expected log messages
2026-05-09 04:12:19 +02:00
Robert Sachunsky
4cd398bd0d
standalone binarization: update, simplify…
- re-use Eynollah base class, drop copied code
- simplify `run()` and `run_single()`
- delegate to `do_prediction()`
instead of custom (old) tiling loop
- drop `predict()`
- add `--device` option to CLI as well
2026-05-09 04:12:02 +02:00
Robert Sachunsky
29abae0144
update CLI test for enhancer…
- update expected log messages
- force `-ncu 3`, because otherwise
the example images would not be deemed
in need of enhancement
2026-05-09 02:59:52 +02:00