eynollah

mirror of https://github.com/qurator-spk/eynollah.git synced 2026-08-03 09:22:32 +02:00

Author	SHA1	Message	Date
Robert Sachunsky	08946067ac	ModelZoo ONNX backend: handle multiple inputs, too	2026-06-12 14:54:51 +02:00
Robert Sachunsky	9d2412080f	training.models for cnn-rnn-ocr: fix config names for height/width… - rename `image_height` → `input_height` - rename `image_width` → `input_width`	2026-06-12 14:52:23 +02:00
Robert Sachunsky	4181e03bc9	`training convert --rebuild` for cnn-rnn-ocr: override charset file… when rebuilding the inference model for cnn-rnn-ocr, - open the old `characters_org.txt` file for the charset - use it to pass the actual `n_classes` (overriding the config) - use its path to pass the `characters_txt_file` (overriding the config)	2026-06-12 14:48:47 +02:00
Robert Sachunsky	348ac95ad3	Eynollah_ocr: drop fixed input sizes… - tr-ocr: no need to resize images in advance (done by model, anyway) - cnn-rnn-ocr: get model size from model's input shape	2026-06-03 20:59:00 +02:00
Robert Sachunsky	24c7d4c277	update trocr smoke test, add cnnrnn ocr smoke test	2026-06-03 20:58:05 +02:00
Robert Sachunsky	27ca9733db	ModelZoo ONNX backend for inference: support multi-input or -output	2026-06-03 20:57:02 +02:00
Robert Sachunsky	38fe4d33ad	Predictor for multi-input models: present as list instead of tuple… (because TF-Serving expects that and cannot cast)	2026-06-03 20:56:00 +02:00
Robert Sachunsky	4e7e1c06b9	trocr variant for Predictor runtime: no model size for input_shape… Because transformers v4 and v5 API for image preprocessor differs, and the model-internal image input sizes are actually irrelevant, because the preprocessor will resize them anyway, and there is no batch dimension (because the input images will have different shapes), do not advertise this information in `.input_shape`.	2026-06-03 20:51:56 +02:00
Robert Sachunsky	f447a9f248	trocr: move preprocessor and decoder into model object, too… - ModelZoo: drop `trocr_processor` model type - `ModelZoo.load_models()`: use Predictor for `ocr_tr` models, too - `ModelZoo.load_model()`: for `ocr_tr`, load processor and model, then define a function object as stand-in for the common model interface based on Keras (w/ `.predict_on_batch()`) - Predictor: allow multi-input without actual batch dimension for `ocr_tr` models (because the model takes a list of original image arrays and resizes them to model shape internally) - Eynollah_ocr: adapt (replacing preprocessing, prediction and decoding steps by a single `.predict()` call)	2026-06-03 03:41:44 +02:00
Robert Sachunsky	d2f2a1e06b	Eynollah_ocr: correctly handle min_conf, improve writer… - `min_conf_value_of_textline_text`: apply by skipping lines below threshold (instead of writing empty text), and delete their TextEquiv (if existing) - `write_ocr()`: simplify, and ensure consistency between line and region level text correctly	2026-06-03 00:43:46 +02:00
Robert Sachunsky	8ffc4ed8d3	Eynollah_ocr: adapt to inference model, improve and simplify… - drop `end_character` mechanics and `characters` model type for decoding output probability (not needed) - drop `decode_batch_predictions()` and `num_to_char` model type (part of inference model) - drop roughshot confidence estimation calculation (returned precisely by inference model) - adapt model prediction to inference model: just omit zeros, map to bytes, filter OOV tokens and decode UTF-8 to str - if no binarization input was provided, then compute it on the fly using `binarization` model - also apply `min_conf_value_of_textline_text` (as for TrOCR) - batching over entire page instead of region-wise (which underfilled batches) - simplify and avoid copied redundant code - rename `extracted_conf_value_merged` → `extracted_confs_merged` - move `batched()` from `utils.utils_ocr` to `utils` - drop `utils_ocr.distortion_free_resize()` (not needed) - simplify `utils_ocr.break_curved_line_into_small_pieces_and_then_merge()` - drop `utils_ocr.return_textline_contour_with_added_box_coordinate()` and `utils_ocr.return_rnn_cnn_ocr_of_given_textlines()` (not needed)	2026-06-02 21:20:06 +02:00
Robert Sachunsky	a391ee24e6	Predictor: handle multi-input and/or multi-output cases	2026-06-02 21:18:22 +02:00
Robert Sachunsky	c79b73dcc8	cnn-rnn-ocr: move CTC decoder and string decoder to inference model… - ModelZoo: drop `num_to_char` and `characters` model types, also drop `_load_characters()` and `_load_num_to_char()` loaders - `ModelZoo.load_models()`: use Predictor for `ocr` models, too - `ModelZoo.load_model()`: delegate runtime/inference conversion of OCR models to `eynollah.training.models.cnn_rnn_ocr_model4inference` - `training.models`: add (purely functional) Keras layer `CTCDecoder` for inference on top of softmax output, but using TF backend function instead of (broken) `Keras.backend.ctc_decode()`, while switching to beam search (instead of greedy) and also returning decoded path probability - `training.models.cnn_rnn_ocr_model()` w/ `inference=True`: * add kwarg `characters_txt_file` for file path of character set * configure secondary tensor path on OCR graph for binarized input (additional input `image_bin`, averaging softmax outputs) * use new `CTCDecoder` layer and inverse `StringLookup` layer to decode from softmax output to tf.string; so inference models now have 2 inputs (RGB, binarized) and 2 outputs (text, prob) * since `np.dtype=object` cannot be handled by SharedMemory (as needed by Predictor queues), also replace tf.string by tf.uint8 arrays * use this for `training convert` for OCR models w/ `--rebuild` - `training.models.cnn_rnn_ocr_model4inference`: * new function which does the same but loads an existing OCR model in training configuration (i.e. without prior `inference=True`) * use this for `training convert` for OCR models w/o `--rebuild`	2026-06-02 20:26:42 +02:00
Robert Sachunsky	13f2f81c45	ModelZoo: support inference with ONNX/TensorRT… - comment out ad-hoc conversion/loading of autosized models - refactor predictor backends for model types into separate functions - only attempt inference conversion of cnn-rnn-ocr model if applicable (`ctc_loss` layer still present) - apply VRAM limits across model types (Keras, TF-Serving, ONNX) - apply TF device selection across model types (Keras, TF-Serving) - implement predictor backend for ONNX models: - using onnxruntime - covering CUDA and TensorRT providers - trying to support manual device selection - hiding session management details - converting float32 to float16	2026-05-28 18:08:08 +02:00
Robert Sachunsky	f833a516e7	training: add CLI command `convert`… - move `train_cli` from cli.py to train.py, add docstring - add `convert_cli`: - load any (supported) model format (i.e. not exported TF-Serving or ONNX) - if SavedModel format with `config.json` present, and `--rebuild` is requested, create new model from `models.get_model()` for this configuration, and load weights - if model type is `cnn-rnn-ocr` and configuration is still for training (`ctc_loss`), then extract inference model - apply requested `--format` conversion: HDF5, Keras native, Keras SavedModel, TF-Serving SavedModel or ONNX - if output format is directory (i.e. SavedModel), then copy over `config.json`, too - reload-models-v0.8.mk: - adapt recipe for converter CLI (i.e. `--format tf-serving` w/ `--rebuild` if possible) - add targets for other useful data formats - extend list of model names to all current models (as all benefit from TF-Serving export) - cancel ONNX conversion for vision transformer models (as these do not work, yet)	2026-05-28 17:48:21 +02:00
Robert Sachunsky	62b55a3809	train params: drop `reload_weights`, re-use `dir_of_start_model`… - drop ad-hoc configuration parameter `reload_weights` (used for conversion/export of models for inference, to be replaced by extra CLI) - re-interpret `dir_of_start_model` to also load weights if not `continue_training`	2026-05-28 17:42:55 +02:00
Robert Sachunsky	093030f503	train/models: move all model builders to `models.get_model()`… - models: add new `get_model()`, passing in Sacred config to capture builder function arguments - train: fewer imports - train: no need to pass `custom_objects` if loading with `compile=False` (and we custom-compile later, anyway)	2026-05-28 17:37:45 +02:00
Robert Sachunsky	faef1967f8	models.cnn_rnn_ocr_model: add `inference` option, drop model name	2026-05-28 17:33:57 +02:00
Robert Sachunsky	c4a7eec5b3	models: cosmetics - using `Reshape`, do not pass `target_shape` as kwarg - add a default `name` for `Patches` and `PatchEncoder`	2026-05-27 01:58:21 +02:00
Robert Sachunsky	9801129aa6	estimate_skew_contours: ensure retval is always float	2026-05-22 12:37:07 +02:00
Robert Sachunsky	26afc5ddab	ModelZoo: ensure exported TensorShape is converted to plain tuple	2026-05-22 12:35:44 +02:00
Robert Sachunsky	0836230c6b	utils_ocr: avoid module-level import of TF	2026-05-21 22:50:53 +02:00
Robert Sachunsky	f3a93983c0	ModelZoo: add `ocr` key for `memory_limit`	2026-05-21 22:50:13 +02:00
Robert Sachunsky	ea41dcae1d	trocr: use beam search instead of greedy decoding	2026-05-21 17:52:27 +02:00
Robert Sachunsky	074753a98e	ModelZoo: fix Torch device selection	2026-05-21 17:25:53 +02:00
Robert Sachunsky	000e4ac8d8	trocr: extract confidence, too	2026-05-21 17:25:39 +02:00
Robert Sachunsky	f3649adbf2	trocr: apply `do_not_mask_with_textline_contour` here, too	2026-05-21 17:23:11 +02:00
Robert Sachunsky	1d67e65f11	trocr: simplify, batch over entire page… - batching over entire page instead of region-wise (underfilling batches) - avoid copied redundant code	2026-05-21 15:48:21 +02:00
Robert Sachunsky	d50bd7c650	trocr: avoid warnings by passing `clean_up_tokenization_spaces=False`	2026-05-21 14:20:51 +02:00
Robert Sachunsky	f9f9130dbb	do_order_of_regions: remove redundant+overcautious assertion	2026-05-21 03:21:36 +02:00
Robert Sachunsky	bf7ec0233d	ModelZoo.load_model: use `memory_limit` instead of `memory_growth`… - growth strategy is more flexible, but uses much more VRAM - limit strategy needs to be calibrated to models (currently fixed), and batch size, but needs much less VRAM and is faster	2026-05-21 02:43:34 +02:00
Robert Sachunsky	94a5e9da14	ModelZoo.load_model: avoid attempting to load exported models as Keras models (which causes a warning), but switch to TF-Serving import right away	2026-05-21 02:41:19 +02:00
Robert Sachunsky	7f2bf715df	ModelZoo.load_model: fix loading exported vs saved models	2026-05-21 02:39:59 +02:00
Robert Sachunsky	3de1407d18	drop unnecessary TF / Torch imports	2026-05-21 02:38:20 +02:00
Robert Sachunsky	bdfebd2c70	reload_weights: `save()` → `export()` w/ `serve()` inference	2026-05-19 03:40:18 +02:00
Robert Sachunsky	86adaf299a	training.models.transformer_block: tf.reshape → Keras Reshape layer	2026-05-19 03:40:16 +02:00
Robert Sachunsky	9efce5e9f2	Predictor.shutdown: use `join()` instead of `terminate()`	2026-05-19 03:40:07 +02:00
Robert Sachunsky	ffe5cdc519	ModelZoo.shutdown: drop extra `del` (already done by `shutdown()`)	2026-05-19 03:40:05 +02:00
Robert Sachunsky	481c286da9	ModelZoo.load_model: no XLA compilation	2026-05-19 03:40:05 +02:00
Robert Sachunsky	f329e10a80	test_layout: rm ignored `--allow_scaling` option	2026-05-19 03:40:04 +02:00
Robert Sachunsky	17b311441a	model_zoo: also parse comma/colon syntax for `device` in Torch case	2026-05-19 03:40:03 +02:00
Robert Sachunsky	be4fe8c263	contour: drop unused functions depending on `rotation_image_new()`	2026-05-19 03:40:02 +02:00
Robert Sachunsky	87cce6c963	CLI tests: add opt-in envvar `EYNOLLAH_OPTIONS` for device selection, model directory etc.	2026-05-19 03:40:01 +02:00
Robert Sachunsky	1ed633bc25	test_model_zoo: adapt (`load_models` instead of `load_model`)	2026-05-19 03:40:00 +02:00
Robert Sachunsky	21ecb043f7	CLIs: move `--device` option to group level	2026-05-19 03:39:59 +02:00
Robert Sachunsky	7ed1a1ebac	CLIs: allow `-h` and show defaults uniformly, harmonise help, drop remaining redundant negative options	2026-05-19 03:39:56 +02:00
Robert Sachunsky	cd62f13872	eynollah_ocr: make work again, re-use Eynollah base class… - re-use Eynollah base class - use `ModelZoo.load_models()` instead of `load_model()` - pass in `device` init kwarg, delegate to `ModelZoo.load_models()` - `device`: return Torch device at loaded model tensors instead of ad-hoc selection - make numeric init kwargs non-optional (only numeric)	2026-05-19 03:39:55 +02:00
Robert Sachunsky	ded668a256	model_zoo: fix clash between Predictor and direct (OCR) use-cases… - `load_models()`: uniformly handle arg types - `load_model()`: move handling of non-model categories to `load_models()` - `load_model()`: move SavedModel preference over HDF5 to `model_path()` - `_load_ocr_model()`: add user-selected device handling and reporting for Torch (as for TF) - `_load_ocr_model()`: move (TF-based) CNN-RNN case to `load_model()` (including Keras layer mapping) - `shutdown()`: only apply `shutdown()` to Predictor model types	2026-05-19 03:39:53 +02:00
Robert Sachunsky	98e6fbbcbb	mbreorder: make work again, re-use Eynollah base class	2026-05-19 03:39:52 +02:00
Robert Sachunsky	7e8b9311d3	Revert "test_model_zoo: fix calls" This reverts commit `5a98f55be3`.	2026-05-19 03:32:37 +02:00


1576 commits
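
Commit 8ffc4ed8d3 describes the post-processing of cnn-rnn-ocr inference output as "just omit zeros, map to bytes, filter OOV tokens and decode UTF-8 to str". The following is only a minimal sketch of that idea, not the project's code: the function names `decode_line` and `encode_line` are hypothetical, and the OOV marker `[UNK]` is an assumption (it is the default out-of-vocabulary token of Keras `StringLookup`; the actual model may use a different one).

```python
# Assumption: the OOV marker is the Keras StringLookup default "[UNK]".
OOV_TOKEN = b"[UNK]"


def decode_line(byte_row, oov_token=OOV_TOKEN):
    """Turn one zero-padded row of byte values (0..255) into a str.

    `byte_row` may be any iterable of integers, e.g. a list or
    one row of a uint8 array.
    """
    # Omit zero padding and map the remaining values to a bytes object.
    raw = bytes(b for b in byte_row if b != 0)
    # Filter out-of-vocabulary markers emitted by the string lookup.
    raw = raw.replace(oov_token, b"")
    # Decode UTF-8; undecodable sequences become U+FFFD instead of raising.
    return raw.decode("utf-8", errors="replace")


def encode_line(text, width):
    """Hypothetical helper for the demo: UTF-8 encode and zero-pad to `width`."""
    data = text.encode("utf-8")
    if len(data) > width:
        raise ValueError("text does not fit into the requested width")
    return list(data) + [0] * (width - len(data))


# Two padded rows standing in for a batch of model outputs.
batch = [encode_line("Straße", 16), encode_line("a[UNK]b", 16)]
texts = [decode_line(row) for row in batch]
```

Representing text as fixed-width byte rows matches the motivation given in commit c79b73dcc8, where `tf.string` outputs were replaced by `tf.uint8` arrays because object arrays cannot pass through shared memory.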