Commit graph

363 commits

Author SHA1 Message Date
vahidrezanezhad
6ae244bf9b Fix filename stem extraction using binarization. Restore the CNN-RNN model to its previous version, as setting channels_last alone was insufficient for running on both CPU and GPU. Prevent errors caused by null values in image shape elements. 2026-01-26 15:04:47 +01:00
vahidrezanezhad
30f39e7383 mapregion is added to labels 2026-01-26 13:56:34 +01:00
vahidrezanezhad
c8240905a8 Fix label generation by selecting largest contour when erosion splits shapes 2026-01-26 13:36:24 +01:00
vahidrezanezhad
49261fa99b CNN–RNN–OCR inference and adaptation of the CNN–RNN–OCR model to support inference on both CPU and GPU 2025-12-17 15:12:39 +01:00
vahidrezanezhad
6ee79c7320 evaluation with a given GT is only possible for segmentation tasks 2025-12-17 13:28:02 +01:00
vahidrezanezhad
4651000191 debugging input shape + enable finetuning a model 2025-12-15 11:36:09 +01:00
vahidrezanezhad
4fc3ff33cb The cnn-rnn ocr model can be trained now 2025-12-09 17:22:12 +01:00
vahidrezanezhad
84a72a128b cnn-rnn model can be called - model input height and width are dynamic now - data generator is also callable 2025-12-09 15:30:19 +01:00
vahidrezanezhad
59e5a73654 adding cnn-rnn training script 2025-12-08 19:30:57 +01:00
vahidrezanezhad
7bf5e077d9 Restore correct execution of export_textline_images_and_text 2025-12-03 15:40:52 +01:00
vahidrezanezhad
6ac37af2f8 Fix eynollah ocr --help so it works again 2025-12-03 14:11:47 +01:00
vahidrezanezhad
d687d862d6 Restored correct functionality of the extract_only_images mode and cleaned up the argument handling 2025-12-03 12:01:42 +01:00
kba
51abe9617a log to STDERR not STDOUT 2025-12-02 15:00:33 +01:00
kba
b161e33854 🔥 refactor eynollah ocr 2025-11-28 15:45:21 +01:00
kba
30f9c695dc move line-gt extraction out of ocr to eynollah-training 2025-11-28 15:12:31 +01:00
kba
9bcfeab057 💀 remove dead code from eynollah.py 2025-11-28 12:52:28 +01:00
kba
5171e09c2d eynollah.py: fix kwargs to writer 2025-11-28 12:52:28 +01:00
kba
c24cf94bce enforce kwargs for writer.build_... 2025-11-28 12:52:28 +01:00
kba
4aa9543a7d remove more branches after textline_light default true 2025-11-27 11:30:00 +01:00
kba
177d555ded factor out extract_only_images as eynollah extract-images 2025-11-26 21:37:00 +01:00
kba
83e8b289da 🔥 drop light_version/textline_light (now default and implied) 2025-11-26 20:48:22 +01:00
kba
ca83cf934d fix imports from src/cli/cli_*/*_cli 2025-11-26 20:48:14 +01:00
kba
095b36c389 models: split into layout, extra and ocr
layout: Everything not OCR or extra
ocr: trocr/cnnrnn models
extra: obsolete or niche models
2025-11-26 19:49:59 +01:00
kba
000af16a47 🔥 remove torch pinning 2025-11-26 19:23:49 +01:00
kba
e503c1a0b7 drop obsolete multi-model binarization 2025-11-26 18:51:41 +01:00
kba
82266f8234 reorganize cli 2025-11-26 18:51:20 +01:00
kba
5a1900e664 🔥 remove OCR option from eynollah layout 2025-11-26 18:12:03 +01:00
kba
0f410c2e7c disable tf/keras logging on first import 2025-11-26 16:37:54 +01:00
kba
9d9d32daed update OCR-D bindings 2025-11-26 16:20:27 +01:00
vahidrezanezhad
ed5b5c13dd Add test images; call TrOCR processor from the same directory as the TrOCR model 2025-11-07 12:47:21 +01:00
kba
f902756ce1 try importing torch, then shapely, then tensorflow 2025-11-06 13:10:35 +01:00
kba
d224b0f7e8 try with shapely.set_precision(...mode="keep_collapsed") 2025-11-06 11:55:40 +01:00
kba
9ab565fa02 model basedir might be a symlink 2025-10-29 21:02:42 +01:00
kba
4772fd17e2 missed changing override mechanism in eynollah_ocr 2025-10-29 20:47:13 +01:00
kba
29c273685f fix merge issues 2025-10-29 20:15:19 +01:00
kba
de76eabc1d Merge branch 'cli-logging' into model-zoo 2025-10-29 19:41:01 +01:00
kba
5e22e9db64 model_zoo: make type str to reduce importing overhead 2025-10-29 19:16:35 +01:00
kba
a913bdf7dc make --model-basedir and --model-overrides top-level CLI options 2025-10-29 18:48:41 +01:00
kba
b6f82c72b9 refactor cli tests 2025-10-29 17:23:21 +01:00
kba
ef999c8f0a Merge branch 'model-zoo' of lx0145.sbb.spk-berlin.de:/data/eynollah into model-zoo 2025-10-27 11:45:20 +01:00
kba
294b6356d3 wip 2025-10-27 11:45:16 +01:00
kba
51d2680d9c wip 2025-10-27 11:44:59 +01:00
kba
ec1fd93dad wip 2025-10-23 11:58:23 +02:00
kba
883546a6b8 eynollah models package 2025-10-22 17:05:40 +02:00
kba
04bc4a63d0 reorganize model_zoo 2025-10-22 16:04:48 +02:00
kba
d94285b3ea rewrite model spec data structure 2025-10-22 13:07:35 +02:00
kba
146658f026 eynollah layout: fix trocr_processor model_zoo call 2025-10-22 10:48:26 +02:00
kba
4c8abfe19c eynollah_ocr: actually replace the model calls 2025-10-22 10:48:26 +02:00
kba
1337461d47 adopt image_enhancer to the zoo 2025-10-21 19:24:55 +02:00
kba
f0c86672f8 adopt mb_ro_on_layout to the zoo 2025-10-21 17:55:08 +02:00