eynollah

mirror of https://github.com/qurator-spk/eynollah.git synced 2026-06-28 07:49:21 +02:00

Author	SHA1	Message	Date
Robert Sachunsky	27f43c175f	Merge branch 'main' into ro-fixes and resolve conflicts… major conflicts resolved manually: - branches for non-`light` segmentation already removed in main - Keras/TF setup and no TF1 sessions, esp. in new ModelZoo - changes to binarizer and its CLI (`mode`, `overwrite`, `run_single()`) - writer: `build...` w/ kwargs instead of positional - training for segmentation/binarization/enhancement tasks: * drop unused `generate_data_from_folder()` * simplify `preprocess_imgs()`: turn `preprocess_img()`, `get_patches()` and `get_patches_num_scale_new()` into generators, only writing result files in the caller (top-level loop) instead of passing output directories and file counter - training for new OCR task: * `train`: put keys into additional `config_params` where they belong, resp. (conditioned under existing keys), and w/ better documentation * `train`: add new keys as kwargs to `run()` to make usable * `utils`: instead of custom data loader `data_gen_ocr()`, re-use existing `preprocess_imgs()` (for cfg capture and top-level loop), but extended w/ new kwargs and calling new `preprocess_img_ocr()`; the latter as single-image generator (also much simplified) * `train`: use tf.data loader pipeline from that generator w/ standard mechanisms for batching, shuffling, prefetching etc. * `utils` and `train`: instead of `vectorize_label`, use `Dataset.padded_batch` * add TensorBoard callback and re-use our checkpoint callback * also use standard Keras top-level loop for training still problematic (substantially unresolved): - `Patches` now only w/ fixed implicit size (ignoring training config params) - `PatchEncoder` now only w/ fixed implicit num patches and projection dim (ignoring training config params)	2026-02-07 14:05:56 +01:00
Robert Sachunsky	6944d31617	modify manual RO preference… in `return_boxes_of_images_by_order_of_reading_new`, when the next multicol separator ends in the same column, do not recurse into subspan if the next starts earlier (but continue with top span to the right first)	2026-02-05 17:58:32 +01:00
Robert Sachunsky	b1633dfc7c	training.generate_gt: for RO, skip files if regionRefs are missing	2026-02-05 17:12:48 +01:00
Robert Sachunsky	5d0c26b629	training.train: use std Keras data loader for classification (much more efficient, works with std F1 metric)	2026-02-05 17:12:48 +01:00
Robert Sachunsky	f03124f747	training.train: simplify+fix classification data loaders… - unify `generate_data_from_folder_training` w/ `..._evaluation` - instead of recreating array after every batch, just zero out - cast image results to uint8 instead of uint16 - cast categorical results to float instead of int	2026-02-05 17:12:48 +01:00
Robert Sachunsky	82d649061a	training.train: fix F1 metric score setup	2026-02-05 17:12:48 +01:00
Robert Sachunsky	5c7801a1d6	training.train: simplify config args for model builder	2026-02-05 17:12:48 +01:00
Robert Sachunsky	4a65ee0c67	training.train: more config dependencies… - make more config_params keys dependent on each other - re-order accordingly - in main, initialise them (as kwarg), so sacred actually allows overriding them by named config file	2026-02-05 11:53:19 +01:00
Robert Sachunsky	7562317da5	training: fix+simplify `load_model` logic for `continue_training` - add missing combination `transformer` (w/ patch encoder and `weighted_loss`) - add assertion to prevent wrong loss type being configured	2026-02-04 17:35:38 +01:00
Robert Sachunsky	1581094141	training: extend `index_start` to tasks classification and RO	2026-02-04 17:35:12 +01:00
Robert Sachunsky	e85003db4a	training: re-instate `index_start`, reflect cfg dependency - `index_start`: re-introduce cfg key, pass to Keras `Model.fit` as `initial_epoch` - make config keys `index_start` and `dir_of_start_model` dependent on `continue_training` - improve description	2026-02-04 17:32:24 +01:00
kba	586077fbcd	📦 v0.7.0	2026-01-30 16:40:55 +01:00
kba	f13560726e	Merge remote-tracking branch 'origin/adding-cnn-rnn-training-script' into 2026-01-29-training # Conflicts: # src/eynollah/training/inference.py	2026-01-29 17:32:08 +01:00
Robert Sachunsky	25153ad307	training: add IoU metric	2026-01-29 12:20:42 +01:00
Robert Sachunsky	d1e8a02fd4	training: fix epoch size calculation	2026-01-29 12:20:42 +01:00
Robert Sachunsky	29a0f19cee	training: simplify image preprocessing… - `utils.provide_patches`: split up loop into * `utils.preprocess_img` (single img function) * `utils.preprocess_imgs` (top-level loop) - capture exceptions for all cases (not just some) at top level and with informative logging - avoid repeating / delegating config keys in several places: only as kwargs to `preprocess_img()` - read files into memory only once, then re-use - improve readability (avoiding long lines, repeated code)	2026-01-29 12:20:42 +01:00
kba	87190f8997	Merge branch 'adding-cnn-rnn-training-script-rfct' into 2026-01-29-training # Conflicts: # src/eynollah/training/models.py	2026-01-29 10:27:36 +01:00
kba	a76de1e182	Merge branch 'adding-cnn-rnn-training-script' into 2026-01-29-training	2026-01-29 10:26:34 +01:00
Robert Sachunsky	e69b35b49c	training.train.config_params: re-organise to reflect dependencies - re-order keys belonging together logically - make keys dependent on each other	2026-01-29 03:01:57 +01:00
Robert Sachunsky	0372fd7a1e	training.gt_gen_utils: fix+simplify cropping… when parsing `PrintSpace` or `Border` from PAGE-XML, - use `lxml` XPath instead of nested loops - convert points to polygons directly (instead of painting on canvas and retrieving contours) - pass result bbox in slice notation (instead of xywh)	2026-01-29 03:01:57 +01:00
Robert Sachunsky	acda9c84ee	training.gt_gen_utils: improve XML→img path mapping… when matching files in `dir_images` by XML path name stem, * use `dict` instead of `list` to assign reliably * filter out `.xml` files (so input directories can be mixed) * show informative warnings for files which cannot be matched	2026-01-29 03:01:57 +01:00
Robert Sachunsky	eb92760f73	training: download pretrained RESNET weights if missing	2026-01-29 03:01:57 +01:00
Robert Sachunsky	87d7ffbdd8	training: use proper Keras callbacks and top-level loop	2026-01-29 03:01:57 +01:00
vahidrezanezhad	f9695cd7be	Merge branch 'adding-cnn-rnn-training-script' of https://github.com/qurator-spk/eynollah into adding-cnn-rnn-training-script	2026-01-28 11:52:36 +01:00
vahidrezanezhad	3500167870	weights ensembling for tensorflow models is integrated	2026-01-28 11:52:12 +01:00
vahidrezanezhad	33f6a231bc	fix: prevent crash when printspace is missing in xmls used for label generation	2026-01-26 17:30:26 +01:00
vahidrezanezhad	6ae244bf9b	Fix filename stem extraction using binarization. Restore the CNN-RNN model to its previous version, as setting channels_last alone was insufficient for running on both CPU and GPU. Prevent errors caused by null values in image shape elements.	2026-01-26 15:04:47 +01:00
vahidrezanezhad	30f39e7383	mapregion is added to labels	2026-01-26 13:56:34 +01:00
vahidrezanezhad	c8240905a8	Fix label generation by selecting largest contour when erosion splits shapes	2026-01-26 13:36:24 +01:00
Robert Sachunsky	3c3effcfda	drop TF1 vernacular, relax TF/Keras and Torch requirements… - do not restrict TF version, but depend on tf-keras and set `TF_USE_LEGACY_KERAS=1` to avoid Keras 3 behaviour - relax Numpy version requirement up to v2 - relax Torch version requirement - drop TF1 session management code - drop TF1 config in favour of TF2 config code for memory growth - training.*: also simplify and limit line length - training.train: always train with TensorBoard callback	2026-01-20 11:34:02 +01:00
Robert Sachunsky	e2754da4f5	adapt to Numpy 1.25 changes… (esp. `np.array(...)` now not allowed on ragged arrays unless `dtype=object`, but then coercing sub-arrays to `object` as well)	2026-01-20 04:04:07 +01:00
kba	9ccc495b4a	wip	2025-12-19 14:57:10 +01:00
vahidrezanezhad	49261fa99b	CNN–RNN–OCR inference and adaptation of the CNN–RNN–OCR model to support inference on both CPU and GPU	2025-12-17 15:12:39 +01:00
vahidrezanezhad	6ee79c7320	evaluation with a given GT is only possible for segmentation tasks	2025-12-17 13:28:02 +01:00
vahidrezanezhad	4651000191	debugging input shape + enable finetuning a model	2025-12-15 11:36:09 +01:00
vahidrezanezhad	4fc3ff33cb	The cnn-rnn ocr model can be trained now	2025-12-09 17:22:12 +01:00
vahidrezanezhad	84a72a128b	cnn-rnn model can be called - model input height and width are dynamic now - data generator is also callable	2025-12-09 15:30:19 +01:00
vahidrezanezhad	59e5a73654	adding cnn-rnn training script	2025-12-08 19:30:57 +01:00
vahidrezanezhad	7bf5e077d9	Restore correct execution of export_textline_images_and_text	2025-12-03 15:40:52 +01:00
vahidrezanezhad	6ac37af2f8	Fix eynollah ocr --help so it works again	2025-12-03 14:11:47 +01:00
vahidrezanezhad	d687d862d6	Restored correct functionality of the extract_only_images mode and cleaned up the argument handling	2025-12-03 12:01:42 +01:00
Robert Sachunsky	9fdae72e96	utils_ocr.return_textline_contour: gen cv2-like contours (w/ ndim=3, as in all other places)	2025-12-03 03:04:46 +01:00
Robert Sachunsky	ad8f8167c2	separate_lines/_vertical: gen cv2-like contours (w/ ndim=3, as in all other places)	2025-12-03 00:58:26 +01:00
Robert Sachunsky	43a95842bd	writer: also ensure validity after scaling	2025-12-02 16:35:32 +01:00
kba	51abe9617a	log to STDERR not STDOUT	2025-12-02 15:00:33 +01:00
Robert Sachunsky	56e73bf72f	deskewing: add a 2nd stage for precision after selecting the optimum angle on the original search range, narrow down around in the vicinity with half the range (adding computational costs, but gaining precision)	2025-11-28 18:27:58 +01:00
Robert Sachunsky	adcea47bc0	return_boxes_of_images_by_order_of_reading_new: always erode when passing the text region mask, do not apply erosion only if there are more than 2 columns, but iff `not erosion_hurts` (consistent with `find_num_col`'s expectations and making it as easy to find the column gaps on 1 and 2-column pages as on multi-column pages)	2025-11-28 18:23:59 +01:00
Robert Sachunsky	5a3de3b42d	column detection: improve, aided by vseps whenever possible - `find_number_of_columns_in_document`: retain vertical separators and pass to `find_num_col` for each vertical split - `return_boxes_of_images_by_order_of_reading_new`: reconstruct the vertical separators from the segmentation mask and the separator bboxes; pass it on to `find_num_col` everywhere - `return_boxes_of_images_by_order_of_reading_new`: no need to try-catch `find_num_col` anymore - `return_boxes_of_images_by_order_of_reading_new`: when a vertical split has too few columns, * do not raise but lower the threshold `multiplier` responsible for allowing gaps as column boundaries * do not pass the `num_col_classifier` (i.e. expected number of resulting columns) of the entire page to the iterative `find_num_col` for each existing column, but only the portion of that span	2025-11-28 18:14:24 +01:00
Robert Sachunsky	4dd40c542b	find_num_col: add optional criterion - sum of vertical separators when searching for gaps between text regions, consider the vertical separator mask (if given): add the vertical sum of vertical separators to the peak scores (making column detection more robust if still slightly skewed or partially obscured by multi-column regions, but fg seps are present)	2025-11-28 18:07:15 +01:00
Robert Sachunsky	84d10962f3	return_boxes_of_images_by_order_of_reading_new: improve - when searching for multi-col box makers, pick the right-most allowable column, not the left-most	2025-11-28 18:04:12 +01:00
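
The two-stage deskewing search described in commit 56e73bf72f (a coarse pass over the original range, then a second pass around the optimum with half the range) can be illustrated with a minimal sketch. This is not eynollah's code: `angle_grid`, `two_stage_search`, `toy_score`, the grid size and the default range are all hypothetical, and the reading of "half the range" as a window of half the original width centred on the coarse optimum is an assumption.

```python
from typing import Callable, List


def angle_grid(center: float, half_range: float, num: int) -> List[float]:
    """Return `num` evenly spaced angles in [center - half_range, center + half_range]."""
    if num < 2:
        return [center]
    step = 2.0 * half_range / (num - 1)
    return [center - half_range + i * step for i in range(num)]


def two_stage_search(score: Callable[[float], float],
                     half_range: float = 10.0,
                     num: int = 21) -> float:
    """Coarse-to-fine search for the angle that maximises `score` (hypothetical helper)."""
    # Stage 1: evaluate the score on a grid over the original search range.
    coarse = angle_grid(0.0, half_range, num)
    best = max(coarse, key=score)
    # Stage 2: same number of samples, centred on the coarse optimum,
    # over half the range, so the step size is halved as well.
    fine = angle_grid(best, half_range / 2.0, num)
    return max(fine, key=score)


def toy_score(angle: float, true_skew: float = 3.3) -> float:
    """Stand-in for a real deskewing criterion: peaks at `true_skew` (assumed value)."""
    return -(angle - true_skew) ** 2


if __name__ == "__main__":
    print(two_stage_search(toy_score))
```

The second stage doubles the number of score evaluations, which matches the trade-off stated in the commit message: more computation in exchange for a finer angle resolution.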


492 commits