Commit graph

1393 commits

Author SHA1 Message Date
Robert Sachunsky
338c4a0edf wrap layout models for prediction (image resize or tiling) all in TF
(to avoid back and forth between CPU and GPU memory when looping
 over image patches)

- `patch_encoder`: define `Model` subclasses which take an existing
  (layout segmentation) model in the constructor, and define a new
  `call()` using the existing model in a GPU-only `tf.function`:
  * `wrap_layout_model_resized`: just `tf.image.resize()` from
    input image to model size, then predict, then resize back
  * `wrap_layout_model_patched`: ditto if smaller than model size;
    otherwise use `tf.image.extract_patches` for patching in a
    sliding-window approach, then predict patches one by one, then
    `tf.scatter_nd` to reconstruct to image size
- when compiling `tf.function` graph, make sure to use input signature
  with variable image size, but avoid retracing each new size sample
- in `EynollahModelZoo.load_model` for relevant model types,
  also wrap the loaded model
  * by `wrap_layout_model_resized` under model name + `_resized`
  * by `wrap_layout_model_patched` under model name + `_patched`
- introduce `do_prediction_new_concept_autosize`,
  replacing `do_prediction/_new_concept`,
  but using passed model's `predict` directly without
  resizing or tiling to model size
- instead of `do_prediction/_new_concept(True, ...)`,
  now call `do_prediction_new_concept_autosize`,
  but with `_patched` appended to model name
- instead of `do_prediction/_new_concept(False, ...)`,
  now call `do_prediction_new_concept_autosize`,
  but with `_resized` appended to model name
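The tiling half of this can be sketched in plain NumPy (the actual wrappers keep everything in a GPU-only `tf.function`, using `tf.image.extract_patches` and `tf.scatter_nd` in-graph; `predict_tiled` and the tile size here are illustrative stand-ins, and the resized wrapper is simply a resize to model size and back):

```python
import numpy as np

def predict_tiled(predict, img, tile=448):
    """Sliding-window sketch: pad, predict tile by tile, reassemble, crop.

    `predict` maps a (tile, tile, C) patch to a (tile, tile, K) result.
    """
    h, w, c = img.shape
    # pad to a multiple of the tile size (mirrors the in-graph padding)
    ph, pw = -h % tile, -w % tile
    padded = np.pad(img, ((0, ph), (0, pw), (0, 0)))
    out = None
    for y in range(0, h + ph, tile):
        for x in range(0, w + pw, tile):
            res = predict(padded[y:y + tile, x:x + tile])
            if out is None:
                # allocate the output once we know the class count K
                out = np.zeros((h + ph, w + pw, res.shape[-1]), res.dtype)
            out[y:y + tile, x:x + tile] = res
    return out[:h, :w]  # crop back to the original image size
```

With an identity `predict`, the reassembled output reproduces the input exactly, which is a handy sanity check for the pad/crop bookkeeping.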
2026-03-07 03:33:44 +01:00
Robert Sachunsky
f33fd57da8 model_zoo: resolve path names coming in from caller (CLI)
(to make relative paths work)
2026-03-05 00:50:32 +01:00
Robert Sachunsky
41dccb216c use (generalized) do_prediction() instead of predict_enhancement() 2026-03-05 00:50:32 +01:00
Robert Sachunsky
341480e9a0 do_prediction: if img was too small for model, also upscale results
(i.e. resize back to match original size after prediction)
2026-03-05 00:50:32 +01:00
Robert Sachunsky
8ebbe65c17 textline_contours: remove unnecessary resize_image, simplify 2026-03-05 00:50:32 +01:00
Robert Sachunsky
3370a3aa85 do_prediction*: avoid 3-channel results, simplify further…
- `do_prediction/_new_concept`: avoid unnecessary `np.repeat`
  on results, aggregate intermediate artificial class mask and
  confidence data in extra arrays
- callers: avoid unnecessarily thresholding the result arrays
- callers: adapt (no need to slice into channels)
- simplify by refactoring thresholding and skeletonization into
  function `seg_mask_label`
- `extract_text_regions*`: drop unused second result array
- `textline_contours`: avoid calculating unused unpatched prediction
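A helper like `seg_mask_label` presumably collapses the per-class result into a single-channel mask; a minimal sketch (signature and threshold semantics assumed, and the skeletonization step omitted):

```python
import numpy as np

def seg_mask_label(pred, threshold=None):
    """Collapse a per-class probability map (H, W, C) into a uint8 label mask.

    Without a threshold this is a plain argmax; with one, the foreground
    class wins only where its probability exceeds the threshold.
    """
    if threshold is None:
        return np.argmax(pred, axis=-1).astype(np.uint8)
    return (pred[..., 1] > threshold).astype(np.uint8)
```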
2026-03-05 00:50:32 +01:00
Robert Sachunsky
ff7dc31a68 do_prediction*: rename identifiers for artificial class thresholding
- `do_prediction_new_concept` w/ patches: remove branches for
  `thresholding_for_artificial_class` (never used, wrong name)
- `do_prediction_new_concept` w/ patches: rename kwarg
  `thresholding_for_some_classes` →
  `thresholding_for_artificial_class`
- `do_prediction_new_concept`: introduce kwarg `artificial_class`
  (for baked constant 4)
- `do_prediction`: introduce kwarg `artificial_class`
  (for baked constant 2)
- `do_prediction/_new_concept`: rename kwargs
  `thresholding_for..._in_light_version` →
  `thresholding_for...`
- `do_prediction`: rename kwarg
  `threshold_art_class_textline` →
  `threshold_art_class`
- `do_prediction_new_concept`: rename kwarg
  `threshold_art_class_layout` →
  `threshold_art_class`
2026-03-02 13:08:11 +01:00
Robert Sachunsky
b9cf68b51a training: fix b6d2440c 2026-03-01 20:00:05 +01:00
Robert Sachunsky
686f1d34aa do_prediction*: simplify (esp. indexing/slicing) 2026-03-01 04:37:20 +01:00
Robert Sachunsky
3b56fa2a5b training: plot GT/prediction and metrics before training (commented) 2026-02-28 20:11:12 +01:00
Robert Sachunsky
e47653f684 training: move nCC metric/loss to .metrics and rename…
- `num_connected_components_regression` → `connected_components_loss`
- move from training.train to training.metrics
2026-02-28 20:11:12 +01:00
Robert Sachunsky
361d40c064 training: improve nCC metric/loss - measure localized congruence…
- instead of just comparing the number of connected components,
  calculate the GT/pred label incidence matrix and take the
  share of singular entries (i.e. those that would be diagonal
  under reordering) over total counts as the similarity score
- also, suppress artificial class in that
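The incidence-matrix score can be sketched in NumPy (function name hypothetical; the real metric lives in `training.metrics`, runs on connected-component label maps, and additionally suppresses the artificial class):

```python
import numpy as np

def incidence_similarity(gt_labels, pred_labels):
    """Share of pixels in incidence cells that are the only nonzero entry
    in both their row and column, i.e. cells that would lie on the
    diagonal under some reordering of GT/pred labels."""
    inc = np.zeros((gt_labels.max() + 1, pred_labels.max() + 1), dtype=np.int64)
    # count how many pixels carry GT label i and pred label j
    np.add.at(inc, (gt_labels.ravel(), pred_labels.ravel()), 1)
    row_nnz = (inc > 0).sum(axis=1, keepdims=True)
    col_nnz = (inc > 0).sum(axis=0, keepdims=True)
    singular = (inc > 0) & (row_nnz == 1) & (col_nnz == 1)
    return inc[singular].sum() / inc.sum()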
2026-02-28 20:11:12 +01:00
Robert Sachunsky
7e06ab2c8c training: add config param add_ncc_loss for layout/binarization…
- add `metrics.metrics_superposition` and `metrics.Superposition`
- if non-zero, mix configured loss with weighted nCC metric
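The mixing presumably amounts to a weighted sum of the configured loss and the nCC metric; a minimal sketch of the shape (names and weighting scheme assumed, not the actual `metrics.Superposition` API):

```python
def superpose(base_loss, ncc_metric, weight):
    """Return a loss that mixes `base_loss` with the nCC metric at `weight`."""
    def loss(y_true, y_pred):
        return ((1.0 - weight) * base_loss(y_true, y_pred)
                + weight * ncc_metric(y_true, y_pred))
    return loss
```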
2026-02-28 20:11:12 +01:00
Robert Sachunsky
c6d9dd7945 training: use mixed precision and XLA (commented out; does not work yet) 2026-02-28 20:10:53 +01:00
Robert Sachunsky
c1d8a72edc training: shuffle tf.data pipelines 2026-02-28 20:10:53 +01:00
Robert Sachunsky
1cff937e72 training: make data pipeline in 7888fa5 more efficient 2026-02-28 20:10:53 +01:00
Robert Sachunsky
f8dd5a328c training: make plotting 18607e0f more efficient…
- avoid control dependencies in model path
- store only every 3rd sample
2026-02-28 20:10:53 +01:00
Robert Sachunsky
2d5de8e595 training.models: use bilinear instead of nearest upsampling…
(to benefit from CUDA optimization)
2026-02-27 12:48:28 +01:00
Robert Sachunsky
ba954d6314 training.models: fix daa084c3 2026-02-27 12:47:59 +01:00
Robert Sachunsky
7c3aeda65e training.models: fix 9b66867c 2026-02-27 12:40:56 +01:00
Robert Sachunsky
439ca350dd training: add metric ConfusionMatrix and plot it to TensorBoard 2026-02-26 13:55:37 +01:00
Robert Sachunsky
b6d2440ce1 training.utils.preprocess_imgs: fix polymorphy in 27f43c1
(Functions cannot be both generators and procedures,
 so make this a pure generator and save the image files
 on the caller's side; this also avoids passing output
 directories)

Moreover, simplify by moving the `os.listdir` into the function
body (saving lots of extra variable bindings).
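The generator-vs-procedure split can be sketched like this (the transform and file naming are illustrative, not the real `preprocess_imgs` behavior):

```python
import os

def preprocess_imgs(src_dir, transform):
    """Pure generator: yields (stem, result) pairs and has no write
    side effects, so callers decide where (and whether) to save."""
    for name in sorted(os.listdir(src_dir)):
        stem = os.path.splitext(name)[0]
        yield stem, transform(os.path.join(src_dir, name))

def save_all(src_dir, out_dir, transform):
    """Caller side: consume the generator and do the saving itself."""
    for stem, result in preprocess_imgs(src_dir, transform):
        with open(os.path.join(out_dir, stem + ".out"), "w") as f:
            f.write(result)
```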
2026-02-25 20:39:15 +01:00
Robert Sachunsky
42bab0f935 docs/train: document --missing-printspace=project 2026-02-25 13:18:40 +01:00
Robert Sachunsky
4202a1b2db training.generate-gt.pagexml2label: add --missing-printspace
- keep default (fallback to full page), but warn
- new option `skip`
- new option `project`
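The three behaviors can be sketched as a small dispatcher (function name, box representation, and signature are hypothetical; the real code operates on PAGE-XML elements):

```python
import warnings

def resolve_printspace(printspace, region_boxes, page_box, missing="fallback"):
    """Sketch of the --missing-printspace choices for a page without one."""
    if printspace is not None:
        return printspace
    if missing == "skip":
        return None  # drop the page entirely
    if missing == "project":
        # project a print space from the union of region bounding boxes
        xs0, ys0, xs1, ys1 = zip(*region_boxes)
        return (min(xs0), min(ys0), max(xs1), max(ys1))
    # default: keep falling back to the full page, but warn
    warnings.warn("no PrintSpace; falling back to the full page")
    return page_box
```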
2026-02-25 11:16:21 +01:00
Robert Sachunsky
7823ea2c95 training.train: add early stopping for OCR 2026-02-25 00:16:07 +01:00
Robert Sachunsky
36e370aa45 training.train: add validation data for OCR 2026-02-25 00:10:43 +01:00
Robert Sachunsky
b399db3c00 training.models: simplify CTC loss layer 2026-02-24 20:43:50 +01:00
Robert Sachunsky
92fc2bd815 training.train: fix data batching for OCR in 27f43c17 2026-02-24 20:42:08 +01:00
Robert Sachunsky
86b009bc31 training.utils.preprocess_imgs: fix file name stemming in 27f43c17 2026-02-24 20:41:08 +01:00
Robert Sachunsky
20a3672be3 training.utils.preprocess_imgs: fix file shuffling in 27f43c17 2026-02-24 20:37:44 +01:00
Robert Sachunsky
658dade0d4 training.config_params: flip_index needed for scaling_flip, too 2026-02-24 20:36:00 +01:00
Robert Sachunsky
abf111de76 training: add metric for (same) number of connected components
(in trying to capture region instance separability)
2026-02-24 17:03:21 +01:00
Robert Sachunsky
18607e0f48 training: plot predictions to TB logs along with training/testing 2026-02-24 17:00:48 +01:00
Robert Sachunsky
56833b3f55 training: fix data representation in 7888fa5
(Eynollah models expect BGR/float instead of RGB/int)
2026-02-24 16:46:19 +01:00
Robert Sachunsky
a9496bbc70 enhancer/mbreorder: use std Keras data loader for classification 2026-02-17 18:39:30 +01:00
Robert Sachunsky
003c88f18a fix double import in 82266f82 2026-02-17 18:23:32 +01:00
Robert Sachunsky
f61effe8ce fix typo in c8240905 2026-02-17 18:20:58 +01:00
Robert Sachunsky
5f71333649 fix missing import in 49261fa9 2026-02-17 18:11:49 +01:00
Robert Sachunsky
67fca82f38 fix missing import in 27f43c17 2026-02-17 18:09:15 +01:00
Robert Sachunsky
6a4163ae56 fix typo in 27f43c17 2026-02-17 18:09:15 +01:00
Robert Sachunsky
c1b5cc92af fix typo in 7562317d 2026-02-17 18:09:15 +01:00
Robert Sachunsky
7bef8fa95a training.train: add verbose=1 consistently 2026-02-17 18:09:15 +01:00
Robert Sachunsky
9b66867c21 training.models: re-use transformer builder code 2026-02-17 18:09:15 +01:00
Robert Sachunsky
daa084c367 training.models: re-use UNet decoder builder code 2026-02-17 18:09:15 +01:00
Robert Sachunsky
fcd10c3956 training.models: re-use RESNET50 builder (+weight init) code 2026-02-17 18:09:15 +01:00
Robert Sachunsky
4414f7b89b training.models.vit_resnet50_unet: re-use IMAGE_ORDERING 2026-02-17 14:18:32 +01:00
Robert Sachunsky
7888fa5968 training: remove data_gen in favor of tf.data pipelines
Instead of looping over file pairs indefinitely and yielding
NumPy arrays, re-use `keras.utils.image_dataset_from_directory`
here as well, but with img/label generators zipped together

(thus, everything will already be loaded/prefetched on the GPU)
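The zipped-pipeline idea (two element streams combined into (img, label) pairs, then batched and prefetched) can be sketched with `tf.data`; here `from_tensor_slices` stands in for the two directory readers:

```python
import tensorflow as tf

def make_pipeline(imgs, labels, batch_size=2):
    """Zip an image stream with a label stream into one tf.data pipeline."""
    img_ds = tf.data.Dataset.from_tensor_slices(imgs)
    lbl_ds = tf.data.Dataset.from_tensor_slices(labels)
    return (tf.data.Dataset.zip((img_ds, lbl_ds))
            .shuffle(buffer_size=len(imgs))
            .batch(batch_size)
            .prefetch(tf.data.AUTOTUNE))
```

`prefetch(tf.data.AUTOTUNE)` is what lets batches be staged ahead of the training step, so the GPU is not left waiting on host-side loading.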
2026-02-17 12:44:45 +01:00
Robert Sachunsky
83c2408192 training.utils.data_gen: avoid repeated array allocation 2026-02-17 12:44:45 +01:00
Robert Sachunsky
514a897dd5 training.train: assert n_epochs vs. index_start 2026-02-17 12:44:45 +01:00
Robert Sachunsky
37338049af training: use relative imports 2026-02-17 12:44:45 +01:00