- ModelZoo: drop `num_to_char` and `characters` model types,
also drop `_load_characters()` and `_load_num_to_char()` loaders
- `ModelZoo.load_models()`: use Predictor for `ocr` models, too
- `ModelZoo.load_model()`: delegate runtime/inference conversion of
OCR models to `eynollah.training.models.cnn_rnn_ocr_model4inference`
- `training.models`: add (purely functional) Keras layer `CTCDecoder`
for inference on top of softmax output, using the TF backend
function instead of the (broken) `keras.backend.ctc_decode()`,
switching to beam search (instead of greedy), and also returning
the decoded path probability
- `training.models.cnn_rnn_ocr_model()` w/ `inference=True`:
* add kwarg `characters_txt_file` for file path of character set
* configure secondary tensor path on OCR graph for binarized input
(additional input `image_bin`, averaging softmax outputs)
* use new `CTCDecoder` layer and inverse `StringLookup` layer to
decode from softmax output to tf.string; so inference models
now have 2 inputs (RGB, binarized) and 2 outputs (text, prob)
* since NumPy arrays with `dtype=object` cannot be handled by
SharedMemory (as needed by Predictor queues), also replace
tf.string by tf.uint8 arrays
* use this for `training convert` for OCR models w/ `--rebuild`
- `training.models.cnn_rnn_ocr_model4inference`:
* new function which does the same but loads an existing OCR model
in training configuration (i.e. without prior `inference=True`)
* use this for `training convert` for OCR models w/o `--rebuild`
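The switch from tf.string to tf.uint8 arrays can be illustrated with a minimal, dependency-free sketch. It assumes UTF-8 encoding with zero padding to a fixed row width; the helper names `encode_texts` and `decode_texts` and the padding scheme are hypothetical and not taken from the code base. A plain `bytearray` stands in for the shared buffer here.

```python
def encode_texts(texts, width):
    """Pack strings into one zero-padded buffer of len(texts) * width bytes."""
    buf = bytearray(len(texts) * width)  # initialized to all zeros
    for i, text in enumerate(texts):
        raw = text.encode("utf-8")[:width]  # truncate overlong lines
        buf[i * width : i * width + len(raw)] = raw
    return buf


def decode_texts(buf, width):
    """Read fixed-width rows back into strings, dropping the zero padding."""
    view = memoryview(buf)
    texts = []
    for start in range(0, len(view), width):
        raw = bytes(view[start : start + width]).rstrip(b"\x00")
        # errors="ignore" drops a multi-byte character cut off by truncation
        texts.append(raw.decode("utf-8", errors="ignore"))
    return texts
```

Because every row has the same size and a plain numeric type, such a buffer can be copied into shared memory, which is not possible for arrays of Python objects.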
- comment out ad-hoc conversion/loading of autosized models
- refactor predictor backends for model types into separate functions
- only attempt inference conversion of cnn-rnn-ocr model
if applicable (`ctc_loss` layer still present)
- apply VRAM limits across model types
(Keras, TF-Serving, ONNX)
- apply TF device selection across model types
(Keras, TF-Serving)
- implement predictor backend for ONNX models:
- using onnxruntime
- covering CUDA and TensorRT providers
- trying to support manual device selection
- hiding session management details
- converting float32 to float16
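How provider selection for the ONNX backend might look can be sketched as follows. This is an illustration only: the function `choose_providers`, the device string format (`CPU` or `GPU<n>`) and the `vram_limit_mb` parameter are assumptions, while the provider names and the `device_id` and `gpu_mem_limit` options are those documented by onnxruntime.

```python
def choose_providers(available, device="GPU0", vram_limit_mb=None):
    """Build a providers list, ordered by preference, with per-provider options.

    available: provider names, e.g. from onnxruntime.get_available_providers()
    device: hypothetical selector, either "CPU" or "GPU<n>"
    vram_limit_mb: optional memory cap for the CUDA provider, in MiB
    """
    providers = []
    if device.upper().startswith("GPU"):
        device_id = int(device[3:] or 0)
        if "TensorrtExecutionProvider" in available:
            providers.append(("TensorrtExecutionProvider", {"device_id": device_id}))
        if "CUDAExecutionProvider" in available:
            options = {"device_id": device_id}
            if vram_limit_mb is not None:
                options["gpu_mem_limit"] = vram_limit_mb * 1024 * 1024  # bytes
            providers.append(("CUDAExecutionProvider", options))
    # always keep the CPU provider as the last fallback
    providers.append(("CPUExecutionProvider", {}))
    return providers
```

The resulting list has the form accepted by the `providers` argument of `onnxruntime.InferenceSession`.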
- move `train_cli` from cli.py to train.py,
add docstring
- add `convert_cli`:
- load any (supported) model format
(i.e. not exported TF-Serving or ONNX)
- if SavedModel format with `config.json` present,
and `--rebuild` is requested, create new model
from `models.get_model()` for this configuration,
and load weights
- if model type is `cnn-rnn-ocr` and configuration
is still for training (`ctc_loss`), then extract
inference model
- apply requested `--format` conversion:
HDF5, Keras native, Keras SavedModel, TF-Serving SavedModel
or ONNX
- if output format is directory (i.e. SavedModel),
then copy over `config.json`, too
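The order of steps in `convert_cli` described above can be summarized in a small sketch. The function `plan_conversion` and its parameters are hypothetical and only mirror the conditions listed; it returns step descriptions rather than performing any conversion.

```python
def plan_conversion(model_type, is_saved_model, has_config, rebuild,
                    has_ctc_loss, output_is_dir):
    """List the conversion steps, in order, for one model.

    has_ctc_loss refers to the model at hand after an optional rebuild,
    i.e. whether it is still in training configuration.
    """
    steps = ["load model"]
    if is_saved_model and has_config and rebuild:
        steps.append("create new model via get_model() from config.json")
        steps.append("load weights")
    if model_type == "cnn-rnn-ocr" and has_ctc_loss:
        steps.append("extract inference model")
    steps.append("convert to requested --format")
    if output_is_dir:
        steps.append("copy config.json")
    return steps
```

For example, an OCR model in training configuration that is converted without `--rebuild` into a directory format goes through loading, inference extraction, format conversion and the `config.json` copy.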
- reload-models-v0.8.mk:
- adapt recipe for converter CLI (i.e. `--format tf-serving`
w/ `--rebuild` if possible)
- add targets for other useful data formats
- extend list of model names to all current models
(as all benefit from TF-Serving export)
- cancel ONNX conversion for vision transformer models
(as these do not work, yet)
- drop ad-hoc configuration parameter `reload_weights`
(used for conversion/export of models for inference,
to be replaced by extra CLI)
- re-interpret `dir_of_start_model` to also load weights
if not `continue_training`
- models: add new `get_model()`, passing in Sacred config
to capture builder function arguments
- train: fewer imports
- train: no need to pass `custom_objects` if loading with
`compile=False` (and we custom-compile later, anyway)
- growth strategy is more flexible, but uses much more VRAM
- limit strategy needs to be calibrated to models (currently fixed),
and batch size, but needs much less VRAM and is faster
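The two strategies can be contrasted with a minimal sketch, assuming they correspond to TensorFlow's memory growth setting and a fixed logical device memory limit. The function name `configure_vram` and its parameters are hypothetical and not taken from the code base.

```python
def configure_vram(strategy="growth", limit_mb=None):
    """Configure GPU memory allocation; call before any GPU has been initialized."""
    if strategy not in ("growth", "limit"):
        raise ValueError(f"unknown strategy: {strategy!r}")
    if strategy == "limit" and limit_mb is None:
        raise ValueError("the limit strategy needs limit_mb")
    # deferred import, so the argument checks work without TensorFlow installed
    import tensorflow as tf

    for gpu in tf.config.list_physical_devices("GPU"):
        if strategy == "growth":
            # allocate on demand; memory is not released again
            tf.config.experimental.set_memory_growth(gpu, True)
        else:
            # hard cap in MiB; must be calibrated to model and batch size
            tf.config.set_logical_device_configuration(
                gpu, [tf.config.LogicalDeviceConfiguration(memory_limit=limit_mb)]
            )
```

Both settings have to be applied before the first tensor is placed on the GPU, otherwise TensorFlow raises a `RuntimeError`.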
- re-use Eynollah base class
- use `ModelZoo.load_models()` instead of `load_model()`
- pass in `device` init kwarg, delegate to `ModelZoo.load_models()`
- `device`: return Torch device of the loaded model tensors
instead of ad-hoc selection
- make numeric init kwargs non-optional (only numeric)
- `load_models()`: uniformly handle arg types
- `load_model()`: move handling of non-model categories
to `load_models()`
- `load_model()`: move SavedModel preference over HDF5 to `model_path()`
- `_load_ocr_model()`: add user-selected device handling and reporting
for Torch (as for TF)
- `_load_ocr_model()`: move (TF-based) CNN-RNN case to `load_model()`
(including Keras layer mapping)
- `shutdown()`: only apply `shutdown()` to Predictor model types
- detected positive and negative peaks, and even more so their
relative offsets, may fall outside the cropped image,
causing spurious textlines; avoid that by clipping to the valid
y coordinates
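The clipping can be sketched in a few lines, assuming peaks and offsets are given as row indices into a crop of known height. The function name `clip_rows` is hypothetical.

```python
def clip_rows(peaks, offsets, height):
    """Shift each peak by its offset and clamp the result to rows 0..height-1."""
    return [min(max(p + o, 0), height - 1) for p, o in zip(peaks, offsets)]
```

With NumPy arrays, the same effect is obtained by `np.clip(peaks + offsets, 0, height - 1)`.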
- calculation of the number of tiles: sometimes one tile fewer
is needed, by making the previous last tile
half-full on the right side
- add some (commented) plotting
- simplify (a lot, but only partially)