Commit graph

30 commits

Author SHA1 Message Date
Robert Sachunsky
27f43c175f Merge branch 'main' into ro-fixes and resolve conflicts…
major conflicts resolved manually:

- branches for non-`light` segmentation already removed in main
- Keras/TF setup and no TF1 sessions, esp. in new ModelZoo
- changes to binarizer and its CLI (`mode`, `overwrite`, `run_single()`)
- writer: `build...` w/ kwargs instead of positional
- training for segmentation/binarization/enhancement tasks:
  * drop unused `generate_data_from_folder()`
  * simplify `preprocess_imgs()`: turn `preprocess_img()`, `get_patches()`
    and `get_patches_num_scale_new()` into generators, only writing
    result files in the caller (top-level loop) instead of passing
    output directories and file counter
- training for new OCR task:
  * `train`: put keys into additional `config_params` where they belong,
    resp. (conditioned under existing keys), and w/ better documentation
  * `train`: add new keys as kwargs to `run()` to make usable
  * `utils`: instead of custom data loader `data_gen_ocr()`, re-use
    existing `preprocess_imgs()` (for cfg capture and top-level loop),
    but extended w/ new kwargs and calling new `preprocess_img_ocr()`;
    the latter as single-image generator (also much simplified)
  * `train`: use tf.data loader pipeline from that generator w/ standard
    mechanisms for batching, shuffling, prefetching etc.
  * `utils` and `train`: instead of `vectorize_label`, use `Dataset.padded_batch`
  * add TensorBoard callback and re-use our checkpoint callback
  * also use standard Keras top-level loop for training
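
Editorial sketch of the generator refactoring described in the list above (not part of the commit message). The function names `preprocess_imgs()`, `preprocess_img()` and `get_patches()` are borrowed from the message, but the signatures, the nested-list image stand-in, the `patch_NNNN.txt` naming and the text output format are all assumptions made for illustration and do not reflect the repository's code. The point shown is only the shape of the change: the inner functions yield results and touch no files, while the top-level loop alone owns the output directory and the file counter.

```python
from pathlib import Path
from typing import Iterator, List

# Hypothetical stand-in for an image array: a list of pixel rows.
Image = List[List[int]]


def get_patches(img: Image, size: int) -> Iterator[Image]:
    """Yield non-overlapping size x size patches; writes nothing to disk."""
    height, width = len(img), len(img[0])
    for top in range(0, height - size + 1, size):
        for left in range(0, width - size + 1, size):
            yield [row[left:left + size] for row in img[top:top + size]]


def preprocess_img(img: Image, size: int) -> Iterator[Image]:
    """Single-image generator; needs neither an output dir nor a counter."""
    yield from get_patches(img, size)


def preprocess_imgs(images: List[Image], out_dir: Path, size: int) -> List[Path]:
    """Top-level loop: the only place that numbers and writes result files."""
    written = []
    counter = 0
    for img in images:
        for patch in preprocess_img(img, size):
            path = out_dir / f"patch_{counter:04d}.txt"
            rows = (" ".join(str(value) for value in row) for row in patch)
            path.write_text("\n".join(rows))
            written.append(path)
            counter += 1
    return written
```

Because the inner functions are plain generators in this sketch, the same `preprocess_img()` could in principle also feed an in-memory loader (such as the tf.data pipeline mentioned for the OCR task) without any file output.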

still problematic (substantially unresolved):
- `Patches` now only w/ fixed implicit size
  (ignoring training config params)
- `PatchEncoder` now only w/ fixed implicit num patches and projection dim
  (ignoring training config params)
2026-02-07 14:05:56 +01:00
Robert Sachunsky
0d3a8eacba improve/update docs/train.md 2026-02-05 17:12:48 +01:00
Robert Sachunsky
6a81db934e improve docs/train.md 2026-01-29 03:01:57 +01:00
Clemens Neudecker
c9efbe1871 refactor image layout in examples.md 2025-10-30 16:52:59 +01:00
cneud
46a45f6b0e Create examples.md 2025-10-29 22:23:48 +01:00
cneud
8822da17cf Merge remote-tracking branch 'origin/updating_docs' into docs_and_minor_fixes 2025-10-28 19:53:12 +01:00
vahidrezanezhad
6192e5ba5c qualitative evaluation of ocr models are added to docs 2025-10-23 16:37:24 +02:00
vahidrezanezhad
d0ad7a98b7 starting qualitative ocr evaluation 2025-10-22 22:45:22 +02:00
vahidrezanezhad
7b7714af2e completing ocr evaluations metric 2025-10-22 22:42:37 +02:00
vahidrezanezhad
b56bb44284 providing ocr model evaluation metrics 2025-10-22 21:30:06 +02:00
cneud
7d70835d22 small fixes to main readme 2025-10-20 23:19:10 +02:00
cneud
230e7cc705 integrate ocrd docs 2025-10-20 22:52:54 +02:00
cneud
e5254dc6c5 integrate training docs 2025-10-20 22:39:54 +02:00
cneud
6e3399fe7a combine Docker docs 2025-10-20 22:16:56 +02:00
vahidrezanezhad
c8455370a9 updating heuristics and ocr documentation 2025-10-20 15:13:45 +02:00
vahidrezanezhad
3ec5ceb22e Update flowchart 2025-10-20 14:55:14 +02:00
vahidrezanezhad
9d2dbb8388 updating model based reading order detection 2025-10-20 14:47:55 +02:00
cneud
496a0e2ca4 readme and documentation updates 2025-10-17 19:19:26 +02:00
kba
f60e0543ab training: update docs 2025-10-01 19:16:58 +02:00
kba
733af1e9a7 📝 update train/README.md, align with docs/train.md 2025-10-01 17:43:32 +02:00
kba
9d8b858dfc remove docs/eynollah-layout, superseded by docs/model.md and docs/usage.md 2025-09-29 16:01:29 +02:00
kba
ce02a3553b 🔥 remove obsolete versions of the training document 2025-09-29 15:18:21 +02:00
kba
6d379782ab 📝 align former upstream train.md with wiki train.md syntactically 2025-09-29 15:11:02 +02:00
kba
52a7c93319 add documentation on training eynollah from sbb_pixelwise_segmentation wiki 2025-09-29 15:05:05 +02:00
kba
ea05461dfe add documentation on eynollah layout from eynollah wiki 2025-09-29 15:04:46 +02:00
kba
56c4b7af88 📝 align pre-merge docs/train.md with former upstream train.md syntactically 2025-09-29 14:59:41 +02:00
kba
3123add815 📝 update README 2025-09-26 15:07:32 +02:00
cneud
0e9a72ea52 consolidate usage documentation 2025-03-27 23:14:59 +01:00
cneud
3a55b6ce91 consolidate usage documentation 2025-03-27 23:11:18 +01:00
cneud
e9fa691308 add model and training documentation 2025-03-27 22:41:10 +01:00