- `cache_images()`: only return an image dict (plus extra keys
for file name stem and dpi) - don't set any attributes
- `imread()`: just take from passed image dict, also add `binary` key
- `resize_and_enhance_image_with_column_classifier()`:
* `imread()` from image dict
* set `img_bin` key for binarization result if `input_binary`
* instead of `image_page_org_size` / `page_coord` attributes,
set `img_page` / `coord_page` in image dict
* instead of retval, set `img_res` in image dict
    * also set `scale_x` / `scale_y` (horizontal / vertical scale factor)
      in image dict
* simplify
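The attribute-to-dict refactor above can be sketched as follows; a minimal illustration (not the real Eynollah code) of an image dict carrying the keys named here (`img`, `name`, `dpi`, and later `img_res`, `scale_x`, `scale_y`), with a toy nearest-neighbour resize standing in for the actual enhancement:

```python
import numpy as np

def cache_images(name_stub: str, img: np.ndarray, dpi: int) -> dict:
    # return an image dict instead of setting instance attributes (sketch)
    return {"img": img, "name": name_stub, "dpi": dpi}

def resize_with_scales(image: dict, new_h: int, new_w: int) -> None:
    # store the resized result and its scale factors in the dict (sketch);
    # nearest-neighbour index sampling stands in for the real resize
    h, w = image["img"].shape[:2]
    rows = np.arange(new_h) * h // new_h
    cols = np.arange(new_w) * w // new_w
    image["img_res"] = image["img"][rows][:, cols]
    image["scale_x"] = new_w / w
    image["scale_y"] = new_h / h

image = cache_images("page_0001", np.zeros((100, 80), dtype=np.uint8), dpi=300)
resize_with_scales(image, 50, 40)
```

Downstream steps then read `img_res` / `scale_x` / `scale_y` from the dict instead of from object attributes.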
- `resize_image_with_column_classifier()`:
* `imread()` from image dict
    * (as in `resize_and_enhance_image_with_column_classifier`)
      call `calculate_width_height_by_columns_1_2` here
      if `num_col` is 1 or 2
* instead of retval, set `img_res` in image dict
    * also set `scale_x` / `scale_y` (horizontal / vertical scale factor)
      in image dict
* simplify
- `calculate_width_height_by_columns*()`: simplify, get confidence of
num_col instead of entire array
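Reducing the classifier output to a single confidence can be sketched like this; a hypothetical helper (names assumed, not from the source) that turns one probability row into `(num_col, confidence)` instead of passing the whole array around:

```python
import numpy as np

def num_col_with_confidence(prediction: np.ndarray) -> tuple:
    # reduce a classifier probability row to (num_col, confidence)
    # instead of handing the entire array to the caller (sketch)
    label = int(np.argmax(prediction))
    return label + 1, float(prediction[label])

probs = np.array([0.05, 0.80, 0.10, 0.03, 0.02])
num_col, conf = num_col_with_confidence(probs)
```

Callers such as `calculate_width_height_by_columns*()` then only need the scalar confidence.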
- `extract_page()`: read `img_res` from image dict; simplify
- `early_page_for_num_of_column_classification()`:
`imread()` from image dict; simplify
- `textline_contours()`: no need for `num_col_classifier` here
- `run_textline()`: no need for `num_col_classifier` here
- `get_regions_light_v()` → `get_regions()`:
* read `img_res` from image dict
* get shapes via `img` from image dict instead of `image_org` attr
* use `img_page` / `coord_page` from image dict instead of attrs
* avoid unnecessary 3-channel arrays
* simplify
- `get_tables_from_model()`: no need for `num_col_classifier` here
- `run_graphics_and_columns_light()` → `run_graphics_and_columns()`:
* pass through image dict instead of `img_bin` (which really was `img_res`)
* simplify
- `run_graphics_and_columns_without_layout()`:
* pass through image dict instead of `img_bin` (which really was `img_res`)
* simplify
- `run_enhancement()`: pass through image dict
- `get_image_and_scales*()`: drop
- `run_boxes_full_layout()`:
* pass `image_page` instead of `img_bin` (which really was `image_page`)
* simplify
- `run()`:
* instantiate plotter outside of loop, and independent of img files
* move writer instantiation and overwrite checks into `run_single()`
  * add try/except for `run_single()` w/ logging
- `reset_file_name_dir`: drop
- `run_single()`:
* add some args/kwargs from `run()`
* call `cache_images()` (reading image dict) here
* instantiate writer here instead of (reused) attr in `run()`
* set `scale_x` / `scale_y` in writer from image dict once known
(i.e. after `run_enhancement()`)
* don't return anything, but write PAGE result here
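The restructured `run()` / `run_single()` split can be sketched as follows; a minimal stand-in (the real per-page work is elided) showing the per-file try/except with logging, so one failing page no longer aborts the whole run:

```python
import logging

logger = logging.getLogger("eynollah.sketch")

def run_single(img_path: str, overwrite: bool = False) -> None:
    # hypothetical stand-in: cache_images(), enhancement, layout analysis
    # and the PAGE writer would all live here; nothing is returned
    pass

def run(image_paths: list) -> list:
    # per-file failures are logged and collected; remaining files
    # are still processed (sketch of the restructured loop)
    failed = []
    for path in image_paths:
        try:
            run_single(path)
        except Exception:
            logger.exception("error processing %s", path)
            failed.append(path)
    return failed

failed = run(["a.png", "b.png"])
```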
- `check_any_text_region_in_model_one_is_main_or_header_light()` →
`split_textregion_main_vs_header()`
- plotter:
* pass `name` (file stem) from image dict to all methods
* for `write_images_into_directory()`: also `scale_x` and `scale_y`
from image dict
- writer:
* init with width/height
- ocrd processor:
* adapt (just `run_single()` call)
* drop `max_workers=1` restriction (can now run fully parallel)
- `get_textregion_contours_in_org_image_light()` →
`get_textregion_confidences()`:
* take shape from confmat directly instead of extra array
* simplify
When the `num_col_classifier` prediction gets bypassed
by the heuristic result from `find_num_col()` (because the prediction
had too little confidence, or `calculate_width_height_by_columns()`
would have become too large), do not increment `num_col` further
(the heuristic value is already 1 more than the number of column separators).
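The corrected decision can be sketched as follows; a hypothetical condensation (names and threshold assumed) of the point being fixed, namely that the heuristic value derived from column separators is already `num_colseps + 1` and must not be incremented again:

```python
def choose_num_col(predicted: int, confidence: float,
                   num_colseps: int, threshold: float = 0.5) -> int:
    # sketch: the find_num_col() analogue already yields colseps + 1,
    # so no extra "+ 1" when it overrides a low-confidence prediction
    heuristic = num_colseps + 1
    if confidence < threshold:
        return heuristic  # bypass the prediction, no further increment
    return predicted
```

E.g. with one detected separator and a low-confidence prediction of 3 columns, the result is 2, not 3.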
- rename `terminate` → `stopped`
- call `terminate()` from superclass during shutdown
- del `self.model_zoo` in the parent process after spawn,
and in the child during shutdown
- new class `Predictor(multiprocessing.Process)` as stand-in
for EynollahModelZoo:
* calling `load_models()` starts the subprocess (and has
`.model_zoo.load_models()` run internally)
* calling `get()` yields a stand-in that supports `.predict()`,
which actually communicates with the singleton subprocess
via task and result queues, sharing Numpy arrays via SHM
* calling `predict()` with an empty dict (instead of an image)
merely retrieves the respective model's output shapes (cached)
* shared memory objects for arrays are cleared as soon as possible
* log messages are piped through QueueHandler / QueueListener
* exceptions are passed through the queues, and raised afterwards
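The subprocess/queue/SHM mechanism can be illustrated with a heavily reduced sketch (assuming a POSIX `fork` start method; the real `Predictor` additionally handles model loading, shape caching, and log forwarding): tasks carry only the shared-memory name, shape, and dtype; a toy `arr.sum()` stands in for `model.predict()`; exceptions travel back through the result queue:

```python
import multiprocessing as mp
import numpy as np
from multiprocessing import shared_memory

def _worker(tasks, results):
    # child process: pull (shm_name, shape, dtype) tasks, run a stand-in
    # "model" on the shared array, push results (or exceptions) back
    while True:
        task = tasks.get()
        if task is None:  # shutdown sentinel
            break
        try:
            name, shape, dtype = task
            shm = shared_memory.SharedMemory(name=name)
            arr = np.ndarray(shape, dtype=dtype, buffer=shm.buf)
            results.put(("ok", float(arr.sum())))  # stand-in for predict()
            shm.close()
        except Exception as exc:
            results.put(("err", exc))  # exceptions pass through the queue

class Predictor:
    # minimal sketch of the singleton-subprocess idea (not the real class)
    def __init__(self):
        ctx = mp.get_context("fork")  # assumed POSIX
        self.tasks = ctx.Queue()
        self.results = ctx.Queue()
        self.proc = ctx.Process(target=_worker,
                                args=(self.tasks, self.results), daemon=True)
        self.proc.start()

    def predict(self, arr: np.ndarray):
        shm = shared_memory.SharedMemory(create=True, size=arr.nbytes)
        np.ndarray(arr.shape, dtype=arr.dtype, buffer=shm.buf)[:] = arr
        self.tasks.put((shm.name, arr.shape, arr.dtype.str))
        status, payload = self.results.get()
        shm.close()
        shm.unlink()  # free shared memory as soon as possible
        if status == "err":
            raise payload
        return payload

    def shutdown(self):
        self.tasks.put(None)
        self.proc.join()

predictor = Predictor()
total = predictor.predict(np.ones((4, 4)))
predictor.shutdown()
```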
- move all TF initialization to the predictor
(to avoid back and forth between CPU and GPU memory when looping
over image patches)
- `patch_encoder`: define `Model` subclasses which take an existing
(layout segmentation) model in the constructor, and define a new
`call()` using the existing model in a GPU-only `tf.function`:
* `wrap_layout_model_resized`: just `tf.image.resize()` from
input image to model size, then predict, then resize back
* `wrap_layout_model_patched`: ditto if smaller than model size;
otherwise use `tf.image.extract_patches` for patching in a
sliding-window approach, then predict patches one by one, then
`tf.scatter_nd` to reconstruct to image size
- when compiling `tf.function` graph, make sure to use input signature
with variable image size, but avoid retracing each new size sample
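The tiling logic that `tf.image.extract_patches` / `tf.scatter_nd` implement inside the `tf.function` graph can be illustrated in plain NumPy (deliberately simplified: no overlap margins, and a `lambda` stands in for the model; the real graph would also use a `tf.TensorSpec` with `None` spatial dimensions to avoid retracing per image size):

```python
import numpy as np

def predict_tiled(img: np.ndarray, model, size: int) -> np.ndarray:
    # NumPy illustration (not the TF graph) of the sliding-window idea:
    # cut the image into size-by-size tiles, "predict" each tile,
    # then scatter the results back to the full image canvas
    h, w = img.shape[:2]
    out = np.zeros_like(img, dtype=float)
    for y in range(0, h, size):
        for x in range(0, w, size):
            tile = img[y:y + size, x:x + size]
            out[y:y + size, x:x + size] = model(tile)
    return out

result = predict_tiled(np.ones((8, 6)), lambda t: t * 2.0, 4)
```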
- in `EynollahModelZoo.load_model` for relevant model types,
also wrap the loaded model
* by `wrap_layout_model_resized` under model name + `_resized`
* by `wrap_layout_model_patched` under model name + `_patched`
- introduce `do_prediction_new_concept_autosize`,
replacing `do_prediction/_new_concept`,
but using passed model's `predict` directly without
resizing or tiling to model size
- instead of `do_prediction/_new_concept(True, ...)`,
now call `do_prediction_new_concept_autosize`,
but with `_patched` appended to model name
- instead of `do_prediction/_new_concept(False, ...)`,
now call `do_prediction_new_concept_autosize`,
but with `_resized` appended to model name
- `do_prediction/_new_concept`: avoid unnecessary `np.repeat`
on results, aggregate intermediate artificial class mask and
confidence data in extra arrays
- callers: avoid unnecessary thresholding the result arrays
- callers: adapt (no need to slice into channels)
- simplify by refactoring thresholding and skeletonization into
function `seg_mask_label`
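The refactored helper might look roughly like this (a sketch with assumed signature; skeletonization, done in the real code with e.g. `skimage.morphology.skeletonize`, is omitted here): one channel of a probability map is turned into a thresholded binary mask for a given label:

```python
import numpy as np

def seg_mask_label(pred: np.ndarray, label: int,
                   threshold: float = 0.5) -> np.ndarray:
    # sketch: pixel belongs to `label` iff it is the argmax class
    # AND its probability clears the threshold
    return ((np.argmax(pred, axis=-1) == label)
            & (pred[..., label] >= threshold)).astype(np.uint8)

pred = np.dstack([np.array([[0.9, 0.2], [0.4, 0.1]]),
                  np.array([[0.1, 0.8], [0.6, 0.9]])])
mask = seg_mask_label(pred, 1)
```

Callers then no longer need to threshold or slice channels themselves.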
- `extract_text_regions*`: drop unused second result array
- `textline_contours`: avoid calculating unused unpatched prediction
- instead of just comparing the number of connected components,
  calculate the GT/pred label incidence matrix and retrieve the
  share of singular entries (i.e. those nearly diagonal under
  reordering) over the total counts as the similarity score
- also, suppress the artificial class in that computation
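The idea of the score can be sketched as follows (an illustration with assumed details, not the exact metric): count GT/pred co-occurrences in an incidence matrix and take the share of one-to-one matches, i.e. rows and columns with a single nonzero entry, over the label count:

```python
import numpy as np

def label_similarity(gt: np.ndarray, pred: np.ndarray) -> float:
    # sketch: incidence matrix of GT labels vs. predicted labels;
    # a label pair is a "singular" match if its row and its column
    # each contain exactly one nonzero entry
    n_gt, n_pred = gt.max() + 1, pred.max() + 1
    inc = np.zeros((n_gt, n_pred), dtype=int)
    np.add.at(inc, (gt.ravel(), pred.ravel()), 1)
    singular = sum(
        1 for i in range(n_gt)
        if np.count_nonzero(inc[i]) == 1
        and np.count_nonzero(inc[:, inc[i].argmax()]) == 1)
    return singular / max(n_gt, n_pred)

perfect = label_similarity(np.array([0, 0, 1, 1]), np.array([0, 0, 1, 1]))
split = label_similarity(np.array([0, 0, 1, 1]), np.array([0, 1, 1, 1]))
```

Identical labelings score 1.0; a split/merged labeling scores lower even if the component counts happen to agree.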
(Functions cannot be both generators and procedures,
so make this a pure generator and save the image files
on the caller's side; also avoids passing output
directories)
Moreover, simplify by moving the `os.listdir` into the function
body (saving lots of extra variable bindings).
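The pure-generator shape described above might look like this (a sketch with assumed names): the directory listing lives inside the function body, and the caller decides where, and whether, to save anything:

```python
import os

def generate_images(input_dir: str):
    # pure generator (sketch): list the directory inside the function
    # body and yield (file name, content); saving is the caller's job
    for name in sorted(os.listdir(input_dir)):
        with open(os.path.join(input_dir, name), "rb") as f:
            yield name, f.read()

# toy usage: the caller consumes the generator and could write files here
import tempfile
tmp = tempfile.mkdtemp()
for n in ("a.bin", "b.bin"):
    with open(os.path.join(tmp, n), "wb") as f:
        f.write(n.encode())
pairs = list(generate_images(tmp))
```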