eynollah

mirror of https://github.com/qurator-spk/eynollah.git synced 2026-08-03 09:22:32 +02:00

Author	SHA1	Message	Date
Robert Sachunsky	a391ee24e6	Predictor: handle multi-input and/or multi-output cases	2026-06-02 21:18:22 +02:00
Robert Sachunsky	c79b73dcc8	cnn-rnn-ocr: move CTC decoder and string decoder to inference model… - ModelZoo: drop `num_to_char` and `characters` model types, also drop `_load_characters()` and `_load_num_to_char()` loaders - `ModelZoo.load_models()`: use Predictor for `ocr` models, too - `ModelZoo.load_model()`: delegate runtime/inference conversion of OCR models to `eynollah.training.models.cnn_rnn_ocr_model4inference` - `training.models`: add (purely functional) Keras layer `CTCDecoder` for inference on top of softmax output, but using TF backend function instead of (broken) `Keras.backend.ctc_decode()`, while switching to beam search (instead of greedy) and also returning decoded path probability - `training.models.cnn_rnn_ocr_model()` w/ `inference=True`: * add kwarg `characters_txt_file` for file path of character set * configure secondary tensor path on OCR graph for binarized input (additional input `image_bin`, averaging softmax outputs) * use new `CTCDecoder` layer and inverse `StringLookup` layer to decode from softmax output to tf.string; so inference models now have 2 inputs (RGB, binarized) and 2 outputs (text, prob) * since `np.dtype=object` cannot be handled by SharedMemory (as needed by Predictor queues), also replace tf.string by tf.uint8 arrays * use this for `training convert` for OCR models w/ `--rebuild` - `training.models.cnn_rnn_ocr_model4inference`: * new function which does the same but loads an existing OCR model in training configuration (i.e. without prior `inference=True`) * use this for `training convert` for OCR models w/o `--rebuild`	2026-06-02 20:26:42 +02:00
Robert Sachunsky	13f2f81c45	ModelZoo: support inference with ONNX/TensorRT… - comment out ad-hoc conversion/loading of autosized models - refactor predictor backends for model types into separate functions - only attempt inference conversion of cnn-rnn-ocr model if applicable (`ctc_loss` layer still present) - apply VRAM limits across model types (Keras, TF-Serving, ONNX) - apply TF device selection across model types (Keras, TF-Serving) - implement predictor backend for ONNX models: - using onnxruntime - covering CUDA and TensorRT providers - trying to support manual device selection - hiding session management details - converting float32 to float16	2026-05-28 18:08:08 +02:00
Robert Sachunsky	f833a516e7	training: add CLI command `convert`… - move `train_cli` from cli.py to train.py, add docstring - add `convert_cli`: - load any (supported) model format (i.e. not exported TF-Serving or ONNX) - if SavedModel format with `config.json` present, and `--rebuild` is requested, create new model from `models.get_model()` for this configuration, and load weights - if model type is `cnn-rnn-ocr` and configuration is still for training (`ctc_loss`), then extract inference model - apply requested `--format` conversion: HDF5, Keras native, Keras SavedModel, TF-Serving SavedModel or ONNX - if output format is directory (i.e. SavedModel), then copy over `config.json`, too - reload-models-v0.8.mk: - adapt recipe for converter CLI (i.e. `--format tf-serving` w/ `--rebuild` if possible) - add targets for other useful data formats - extend list of model names to all current models (as all benefit from TF-Serving export) - cancel ONNX conversion for vision transformer models (as these do not work, yet)	2026-05-28 17:48:21 +02:00
Robert Sachunsky	62b55a3809	train params: drop `reload_weights`, re-use `dir_of_start_model`… - drop ad-hoc configuration parameter `reload_weights` (used for conversion/export of models for inference, to be replaced by extra CLI) - re-interpret `dir_of_start_model` to also load weights if not `continue_training`	2026-05-28 17:42:55 +02:00
Robert Sachunsky	093030f503	train/models: move all model builders to `models.get_model()`… - models: add new `get_model()`, passing in Sacred config to capture builder function arguments - train: fewer imports - train: no need to pass `custom_objects` if loading with `compile=False` (and we custom-compile later, anyway)	2026-05-28 17:37:45 +02:00
Robert Sachunsky	faef1967f8	models.cnn_rnn_ocr_model: add `inference` option, drop model name	2026-05-28 17:33:57 +02:00
Robert Sachunsky	c4a7eec5b3	models: cosmetics - using `Reshape`, do not pass `target_shape` as kwarg - add a default `name` for `Patches` and `PatchEncoder`	2026-05-27 01:58:21 +02:00
Robert Sachunsky	9801129aa6	estimate_skew_contours: ensure retval is always float	2026-05-22 12:37:07 +02:00
Robert Sachunsky	26afc5ddab	ModelZoo: ensure exported TensorShape is converted to plain tuple	2026-05-22 12:35:44 +02:00
Robert Sachunsky	0836230c6b	utils_ocr: avoid module-level import of TF	2026-05-21 22:50:53 +02:00
Robert Sachunsky	f3a93983c0	ModelZoo: add `ocr` key for `memory_limit`	2026-05-21 22:50:13 +02:00
Robert Sachunsky	ea41dcae1d	trocr: use beam search instead of greedy decoding	2026-05-21 17:52:27 +02:00
Robert Sachunsky	074753a98e	ModelZoo: fix Torch device selection	2026-05-21 17:25:53 +02:00
Robert Sachunsky	000e4ac8d8	trocr: extract confidence, too	2026-05-21 17:25:39 +02:00
Robert Sachunsky	f3649adbf2	trocr: apply `do_not_mask_with_textline_contour` here, too	2026-05-21 17:23:11 +02:00
Robert Sachunsky	1d67e65f11	trocr: simplify, batch over entire page… - batching over entire page instead of region-wise (underfilling batches) - avoid copied redundant code	2026-05-21 15:48:21 +02:00
Robert Sachunsky	d50bd7c650	trocr: avoid warnings by passing `clean_up_tokenization_spaces=False`	2026-05-21 14:20:51 +02:00
Robert Sachunsky	f9f9130dbb	do_order_of_regions: remove redundant+overcautious assertion	2026-05-21 03:21:36 +02:00
Robert Sachunsky	bf7ec0233d	ModelZoo.load_model: use `memory_limit` instead of `memory_growth`… - growth strategy is more flexible, but uses much more VRAM - limit strategy needs to be calibrated to models (currently fixed), and batch size, but needs much less VRAM and is faster	2026-05-21 02:43:34 +02:00
Robert Sachunsky	94a5e9da14	ModelZoo.load_model: avoid attempting to load exported models as Keras models (which causes a warning), but switch to TF-Serving import right away	2026-05-21 02:41:19 +02:00
Robert Sachunsky	7f2bf715df	ModelZoo.load_model: fix loading exported vs saved models	2026-05-21 02:39:59 +02:00
Robert Sachunsky	3de1407d18	drop unnecessary TF / Torch imports	2026-05-21 02:38:20 +02:00
Robert Sachunsky	bdfebd2c70	reload_weights: `save()` → `export()` w/ `serve()` inference	2026-05-19 03:40:18 +02:00
Robert Sachunsky	86adaf299a	training.models.transformer_block: tf.reshape → Keras Reshape layer	2026-05-19 03:40:16 +02:00
Robert Sachunsky	9efce5e9f2	Predictor.shutdown: use `join()` instead of `terminate()`	2026-05-19 03:40:07 +02:00
Robert Sachunsky	ffe5cdc519	ModelZoo.shutdown: drop extra `del` (already done by `shutdown()`)	2026-05-19 03:40:05 +02:00
Robert Sachunsky	481c286da9	ModelZoo.load_model: no XLA compilation	2026-05-19 03:40:05 +02:00
Robert Sachunsky	f329e10a80	test_layout: rm ignored `--allow_scaling` option	2026-05-19 03:40:04 +02:00
Robert Sachunsky	17b311441a	model_zoo: also parse comma/colon syntax for `device` in Torch case	2026-05-19 03:40:03 +02:00
Robert Sachunsky	be4fe8c263	contour: drop unused functions depending on `rotation_image_new()`	2026-05-19 03:40:02 +02:00
Robert Sachunsky	87cce6c963	CLI tests: add opt-in envvar `EYNOLLAH_OPTIONS` for device selection, model directory etc.	2026-05-19 03:40:01 +02:00
Robert Sachunsky	1ed633bc25	test_model_zoo: adapt (`load_models` instead of `load_model`)	2026-05-19 03:40:00 +02:00
Robert Sachunsky	21ecb043f7	CLIs: move `--device` option to group level	2026-05-19 03:39:59 +02:00
Robert Sachunsky	7ed1a1ebac	CLIs: allow `-h` and show defaults uniformly, harmonise help, drop remaining redundant negative options	2026-05-19 03:39:56 +02:00
Robert Sachunsky	cd62f13872	eynollah_ocr: make work again, re-use Eynollah base class… - re-use Eynollah base class - use `ModelZoo.load_models()` instead of `load_model()` - pass in `device` init kwarg, delegate to `ModelZoo.load_models()` - `device`: return Torch device at loaded model tensors instead of ad-hoc selection - make numeric init kwargs non-optional (only numeric)	2026-05-19 03:39:55 +02:00
Robert Sachunsky	ded668a256	model_zoo: fix clash between Predictor and direct (OCR) use-cases… - `load_models()`: uniformly handle arg types - `load_model()`: move handling of non-model categories to `load_models()` - `load_model()`: move SavedModel preference over HDF5 to `model_path()` - `_load_ocr_model()`: add user-selected device handling and reporting for Torch (as for TF) - `_load_ocr_model()`: move (TF-based) CNN-RNN case to `load_model()` (including Keras layer mapping) - `shutdown()`: only apply `shutdown()` to Predictor model types	2026-05-19 03:39:53 +02:00
Robert Sachunsky	98e6fbbcbb	mbreorder: make work again, re-use Eynollah base class	2026-05-19 03:39:52 +02:00
Robert Sachunsky	7e8b9311d3	Revert "test_model_zoo: fix calls" This reverts commit `5a98f55be3`.	2026-05-19 03:32:37 +02:00
Robert Sachunsky	a1449da1d1	Revert "fix model loading in mb_ro and ocr" This reverts commit `218a95e6a0`.	2026-05-19 03:32:19 +02:00
kba	1df32eba87	CD: base docker image: typo {,v}3.13.0	2026-05-11 13:41:30 +02:00
kba	d7337a3080	CD: base docker image on versioned ocrd/core-cuda-tf2:v3.13.0	2026-05-11 13:38:36 +02:00
kba	e612db2bb1	📦 v0.8.0	2026-05-11 13:16:30 +02:00
kba	6cfbd93ac7	📝 changelog	2026-05-11 13:14:56 +02:00
kba	c7104c2852	Merge branch 'prepare-release-v0.8.0'	2026-05-11 13:12:19 +02:00
kba	5a98f55be3	test_model_zoo: fix calls	2026-05-11 12:22:24 +02:00
kba	218a95e6a0	fix model loading in mb_ro and ocr	2026-05-11 12:19:20 +02:00
kba	2035b07b55	Merge remote-tracking branch 'bertsky/ro-fixes-final' into prepare-release-v0.8.0 # Conflicts: # requirements-ocr.txt	2026-05-11 09:46:17 +02:00
Robert Sachunsky	db87aa995d	reqs for OCR: relax `ad5f2272` (depending on Python version)	2026-05-11 03:15:54 +02:00
Robert Sachunsky	e183937c5d	separate_lines_new2: fix coord overflow by clipping, simplify… - found positive and negative peaks, and even more so their relative offsets, may overflow in the cropped image, causing fake textlines; avoid that by clipping to the valid y coordinates - calculation for number of tiles: sometimes one less tile is needed by making the previous last tile half-full on the right side - add some (commented) plotting - simplify (a lot, but only partially)	2026-05-11 03:09:02 +02:00
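
Commit c79b73dcc8 above notes that `np.dtype=object` cannot be handled by SharedMemory, so the OCR inference model returns tf.uint8 arrays instead of tf.string. The general idea can be sketched with the standard library alone. The helper names (`encode_fixed`, `decode_fixed`, `roundtrip`) and the fixed row width are hypothetical, chosen for illustration; this is not eynollah's actual code, which works on TF tensors and Predictor queues.

```python
from multiprocessing import shared_memory


def encode_fixed(texts, width):
    """Pack strings into one zero-padded byte buffer, `width` bytes per string."""
    buf = bytearray(len(texts) * width)
    for i, text in enumerate(texts):
        raw = text.encode("utf-8")
        if len(raw) > width:
            raise ValueError(f"string {i} needs {len(raw)} bytes, width is {width}")
        buf[i * width:i * width + len(raw)] = raw
    return bytes(buf)


def decode_fixed(buf, width):
    """Inverse of encode_fixed: split into rows and strip the zero padding."""
    rows = [bytes(buf[i:i + width]) for i in range(0, len(buf), width)]
    return [row.rstrip(b"\x00").decode("utf-8") for row in rows]


def roundtrip(texts, width=64):
    """Send strings through a SharedMemory block as plain bytes (illustrative)."""
    payload = encode_fixed(texts, width)
    # SharedMemory rejects size 0, so allocate at least one byte
    shm = shared_memory.SharedMemory(create=True, size=max(1, len(payload)))
    try:
        shm.buf[:len(payload)] = payload
        # copy out before closing, so no view of the buffer stays alive
        data = bytes(shm.buf[:len(payload)])
    finally:
        shm.close()
        shm.unlink()
    return decode_fixed(data, width)
```

A fixed width keeps the buffer rectangular, which is what a numeric array needs. The cost is a hard upper bound on string length, and trailing NUL bytes in the text itself would be stripped on decoding.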

1565 commits