Commit graph

1384 commits

Author SHA1 Message Date
Robert Sachunsky
0d3a8eacba improve/update docs/train.md 2026-02-05 17:12:48 +01:00
Robert Sachunsky
b1633dfc7c training.generate_gt: for RO, skip files if regionRefs are missing 2026-02-05 17:12:48 +01:00
Robert Sachunsky
5d0c26b629 training.train: use std Keras data loader for classification
(much more efficient, works with std F1 metric)
2026-02-05 17:12:48 +01:00
Robert Sachunsky
f03124f747 training.train: simplify+fix classification data loaders…
- unify `generate_data_from_folder_training` w/ `..._evaluation`
- instead of recreating array after every batch, just zero out
- cast image results to uint8 instead of uint16
- cast categorical results to float instead of int
2026-02-05 17:12:48 +01:00
Robert Sachunsky
82d649061a training.train: fix F1 metric score setup 2026-02-05 17:12:48 +01:00
Robert Sachunsky
5c7801a1d6 training.train: simplify config args for model builder 2026-02-05 17:12:48 +01:00
Robert Sachunsky
4a65ee0c67 training.train: more config dependencies…
- make more config_params keys dependent on each other
- re-order accordingly
- in main, initialise them (as kwarg), so sacred actually
  allows overriding them by named config file
2026-02-05 11:53:19 +01:00
Robert Sachunsky
7562317da5 training: fix+simplify load_model logic for continue_training
- add missing combination `transformer` (w/ patch encoder and
  `weighted_loss`)
- add assertion to prevent wrong loss type being configured
2026-02-04 17:35:38 +01:00
Robert Sachunsky
1581094141 training: extend index_start to tasks classification and RO 2026-02-04 17:35:12 +01:00
Robert Sachunsky
e85003db4a training: re-instate index_start, reflect cfg dependency
- `index_start`: re-introduce cfg key, pass to Keras `Model.fit`
  as `initial_epoch`
- make config keys `index_start` and `dir_of_start_model` dependent
  on `continue_training`
- improve description
2026-02-04 17:32:24 +01:00
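The `index_start` change above relies on the `initial_epoch` argument of Keras `Model.fit`. What follows is a minimal sketch of how the resume-related keys might be translated into `fit` keyword arguments. The helper name `fit_kwargs` and the `n_epochs` key are assumptions for illustration and are not taken from the repository. Keras treats `epochs` as the index of the final epoch, not as the number of epochs still to run.

```python
def fit_kwargs(config):
    """Build keyword arguments for Keras `Model.fit` (hypothetical helper)."""
    continue_training = config.get("continue_training", False)
    # `index_start` is only meaningful when resuming, so ignore it otherwise
    initial_epoch = config.get("index_start", 0) if continue_training else 0
    return {"epochs": config["n_epochs"], "initial_epoch": initial_epoch}


# resuming after 4 finished epochs: training runs epochs 4..9
resumed = fit_kwargs({"n_epochs": 10, "continue_training": True, "index_start": 4})
# fresh run: a stray `index_start` has no effect
fresh = fit_kwargs({"n_epochs": 10, "index_start": 4})
```

The result would then be passed on as `model.fit(data, **fit_kwargs(config))`.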
kba
586077fbcd 📦 v0.7.0 2026-01-30 16:40:55 +01:00
kba
4ade0f788f 📝 changelog 2026-01-29 17:33:35 +01:00
kba
f13560726e Merge remote-tracking branch 'origin/adding-cnn-rnn-training-script' into 2026-01-29-training
# Conflicts:
#	src/eynollah/training/inference.py
2026-01-29 17:32:08 +01:00
Robert Sachunsky
25153ad307 training: add IoU metric 2026-01-29 12:20:42 +01:00
Robert Sachunsky
d1e8a02fd4 training: fix epoch size calculation 2026-01-29 12:20:42 +01:00
Robert Sachunsky
29a0f19cee training: simplify image preprocessing…
- `utils.provide_patches`: split up loop into
  * `utils.preprocess_img` (single img function)
  * `utils.preprocess_imgs` (top-level loop)
- capture exceptions for all cases (not just some)
  at top level and with informative logging
- avoid repeating / delegating config keys in several
  places: only as kwargs to `preprocess_img()`
- read files into memory only once, then re-use
- improve readability (avoiding long lines, repeated code)
2026-01-29 12:20:42 +01:00
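The split described in the commit above (a single-image function plus a top-level loop that catches and logs all failures) can be sketched as follows. This is a generic illustration only: the function bodies, the `scale` parameter and the logger name are placeholders and do not reproduce the actual `utils.preprocess_img` / `utils.preprocess_imgs` code.

```python
import logging

logger = logging.getLogger("preprocess_sketch")  # hypothetical logger name


def preprocess_img(name, data, scale=1):
    """Process one image; here a stand-in that scales a list of pixel values."""
    if not data:
        raise ValueError(f"empty image {name}")
    return [value * scale for value in data]


def preprocess_imgs(images, **kwargs):
    """Top-level loop: forward config as kwargs, log and skip failing inputs."""
    results = {}
    for name, data in images.items():
        try:
            results[name] = preprocess_img(name, data, **kwargs)
        except Exception:
            # one handler for all failure cases, with traceback in the log
            logger.exception("skipping %s", name)
    return results


out = preprocess_imgs({"a.png": [1, 2], "broken.png": []}, scale=2)
```

Passing the configuration only as keyword arguments to the single-image function keeps each key in one place.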
kba
87190f8997 Merge branch 'adding-cnn-rnn-training-script-rfct' into 2026-01-29-training
# Conflicts:
#	src/eynollah/training/models.py
2026-01-29 10:27:36 +01:00
kba
a76de1e182 Merge branch 'adding-cnn-rnn-training-script' into 2026-01-29-training 2026-01-29 10:26:34 +01:00
kba
ef3cf02877 Merge branch 'ruff-training' into 2026-01-29-training 2026-01-29 10:26:14 +01:00
Robert Sachunsky
e69b35b49c training.train.config_params: re-organise to reflect dependencies
- re-order keys belonging together logically
- make keys dependent on each other
2026-01-29 03:01:57 +01:00
Robert Sachunsky
0372fd7a1e training.gt_gen_utils: fix+simplify cropping…
when parsing `PrintSpace` or `Border` from PAGE-XML,
- use `lxml` XPath instead of nested loops
- convert points to polygons directly
  (instead of painting on canvas and retrieving contours)
- pass result bbox in slice notation
  (instead of xywh)
2026-01-29 03:01:57 +01:00
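As an illustration of the cropping approach in the commit above, here is a small sketch that reads the `Coords/@points` of `PrintSpace` or `Border` and returns the bounding box in slice notation. It uses the standard library's `xml.etree.ElementTree` instead of `lxml` so that it runs without extra dependencies. The function name `crop_slices`, the sample document and the preference for `PrintSpace` over `Border` are assumptions.

```python
import xml.etree.ElementTree as ET

NS = {"pc": "http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15"}

PAGE_XML = """<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15">
  <Page imageFilename="page.png" imageWidth="1000" imageHeight="800">
    <PrintSpace>
      <Coords points="100,50 900,50 900,700 100,700"/>
    </PrintSpace>
  </Page>
</PcGts>"""


def crop_slices(root):
    """Return (rows, cols) slices for PrintSpace or Border, or None if absent."""
    for tag in ("PrintSpace", "Border"):
        coords = root.find(f"./pc:Page/pc:{tag}/pc:Coords", NS)
        if coords is not None:
            # "x1,y1 x2,y2 ..." -> [(x1, y1), (x2, y2), ...]
            points = [tuple(map(int, pair.split(",")))
                      for pair in coords.get("points").split()]
            xs, ys = zip(*points)
            return slice(min(ys), max(ys)), slice(min(xs), max(xs))
    return None


rows, cols = crop_slices(ET.fromstring(PAGE_XML))
```

With an image loaded as an array, the crop is then simply `image[rows, cols]`.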
Robert Sachunsky
acda9c84ee training.gt_gen_utils: improve XML→img path mapping…
when matching files in `dir_images` by XML path name stem,
 * use `dict` instead of `list` to assign reliably
 * filter out `.xml` files (so input directories can be mixed)
 * show informative warnings for files which cannot be matched
2026-01-29 03:01:57 +01:00
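The matching strategy listed above (a `dict` keyed by file name stem, `.xml` files filtered out, warnings for unmatched files) could look roughly like this. The function name `map_xml_to_images` and the logger name are hypothetical, and the sketch works on path lists instead of reading `dir_images` from disk.

```python
import logging
from pathlib import Path

logger = logging.getLogger("gt_mapping_sketch")  # hypothetical logger name


def map_xml_to_images(xml_paths, dir_listing):
    """Map each XML path to the image sharing its file name stem."""
    images = {}
    for path in map(Path, dir_listing):
        if path.suffix.lower() == ".xml":
            continue  # directory may contain XML files alongside images
        images[path.stem] = path
    mapping = {}
    for xml_path in map(Path, xml_paths):
        image = images.get(xml_path.stem)
        if image is None:
            logger.warning("no image found for %s", xml_path)
            continue
        mapping[xml_path] = image
    return mapping


mapping = map_xml_to_images(
    ["gt/page1.xml", "gt/page2.xml"],
    ["img/page1.png", "img/page1.xml", "img/page3.jpg"],
)
```

Looking up by stem in a `dict` does not depend on both directories being sorted identically, which is the weakness of pairing two lists by position.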
Robert Sachunsky
eb92760f73 training: download pretrained RESNET weights if missing 2026-01-29 03:01:57 +01:00
Robert Sachunsky
6a81db934e improve docs/train.md 2026-01-29 03:01:57 +01:00
Robert Sachunsky
87d7ffbdd8 training: use proper Keras callbacks and top-level loop 2026-01-29 03:01:57 +01:00
vahidrezanezhad
f9695cd7be Merge branch 'adding-cnn-rnn-training-script' of https://github.com/qurator-spk/eynollah into adding-cnn-rnn-training-script 2026-01-28 11:52:36 +01:00
vahidrezanezhad
3500167870 weights ensembling for tensorflow models is integrated 2026-01-28 11:52:12 +01:00
vahidrezanezhad
33f6a231bc fix: prevent crash when printspace is missing in xmls used for label generation 2026-01-26 17:30:26 +01:00
vahidrezanezhad
6ae244bf9b Fix filename stem extraction using binarization. Restore the CNN-RNN model to its previous version, as setting channels_last alone was insufficient for running on both CPU and GPU. Prevent errors caused by null values in image shape elements. 2026-01-26 15:04:47 +01:00
vahidrezanezhad
30f39e7383 mapregion is added to labels 2026-01-26 13:56:34 +01:00
vahidrezanezhad
c8240905a8 Fix label generation by selecting largest contour when erosion splits shapes 2026-01-26 13:36:24 +01:00
Robert Sachunsky
3c3effcfda drop TF1 vernacular, relax TF/Keras and Torch requirements…
- do not restrict TF version, but depend on tf-keras and
  set `TF_USE_LEGACY_KERAS=1` to avoid Keras 3 behaviour
- relax Numpy version requirement up to v2
- relax Torch version requirement
- drop TF1 session management code
- drop TF1 config in favour of TF2 config code for memory growth
- training.*: also simplify and limit line length
- training.train: always train with TensorBoard callback
2026-01-20 11:34:02 +01:00
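A minimal sketch of the two runtime settings mentioned in the commit above: selecting legacy Keras via `TF_USE_LEGACY_KERAS=1` (which must happen before TensorFlow is imported and requires the `tf-keras` package) and enabling GPU memory growth through the TF2 config API. The function name `enable_memory_growth` is an assumption, and the TensorFlow import is kept inside the function so the snippet loads even where TensorFlow is not installed.

```python
import os

# must be set before the first `import tensorflow`
os.environ["TF_USE_LEGACY_KERAS"] = "1"


def enable_memory_growth():
    """Let TensorFlow allocate GPU memory on demand; returns the GPU count."""
    import tensorflow as tf  # imported late so the env var is already in place

    gpus = tf.config.list_physical_devices("GPU")
    for gpu in gpus:
        tf.config.experimental.set_memory_growth(gpu, True)
    return len(gpus)
```

Memory growth has to be configured before any GPU has been initialised, so `enable_memory_growth()` would be called once at program start.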
Robert Sachunsky
e2754da4f5 adapt to Numpy 1.25 changes…
(esp. `np.array(...)` now not allowed on ragged arrays unless
 `dtype=object`, but then coercing sub-arrays to `object` as well)
2026-01-20 04:04:07 +01:00
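To illustrate the pitfall named in the commit above: with `dtype=object`, `np.array(...)` accepts ragged input, but equally shaped sub-arrays are merged into one multi-dimensional object array. One way to always get a 1-D array of intact sub-arrays is to fill a preallocated object array element by element. The helper name `to_object_array` is an assumption and not taken from the repository.

```python
import numpy as np


def to_object_array(items):
    """Return a 1-D object array whose elements are the items themselves."""
    out = np.empty(len(items), dtype=object)
    for i, item in enumerate(items):
        out[i] = item  # scalar index assignment stores the array as-is
    return out


# contours of different lengths (ragged)
ragged = to_object_array([np.zeros((4, 1, 2), dtype=int),
                          np.zeros((7, 1, 2), dtype=int)])
# contours of equal length: still a 1-D container, elements keep their dtype
same = to_object_array([np.zeros((4, 1, 2), dtype=int)] * 2)
```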
kba
9ccc495b4a wip 2025-12-19 14:57:10 +01:00
vahidrezanezhad
49261fa99b CNN–RNN–OCR inference and adaptation of the CNN–RNN–OCR model to support inference on both CPU and GPU 2025-12-17 15:12:39 +01:00
vahidrezanezhad
6ee79c7320 evaluation with a given GT is only possible for segmentation tasks 2025-12-17 13:28:02 +01:00
vahidrezanezhad
4651000191 debugging input shape + enable finetuning a model 2025-12-15 11:36:09 +01:00
vahidrezanezhad
4fc3ff33cb The cnn-rnn ocr model can be trained now 2025-12-09 17:22:12 +01:00
vahidrezanezhad
84a72a128b cnn-rnn model can be called - model input height and width are dynamic now - data generator is also callable 2025-12-09 15:30:19 +01:00
vahidrezanezhad
59e5a73654 adding cnn-rnn training script 2025-12-08 19:30:57 +01:00
vahidrezanezhad
7bf5e077d9 Restore correct execution of export_textline_images_and_text 2025-12-03 15:40:52 +01:00
vahidrezanezhad
6ac37af2f8 Fix eynollah ocr --help so it works again 2025-12-03 14:11:47 +01:00
vahidrezanezhad
d687d862d6 Restored correct functionality of the extract_only_images mode and cleaned up the argument handling 2025-12-03 12:01:42 +01:00
Robert Sachunsky
9fdae72e96 utils_ocr.return_textline_contour: gen cv2-like contours (w/ ndim=3, as in all other places) 2025-12-03 03:04:46 +01:00
Robert Sachunsky
ad8f8167c2 separate_lines/_vertical: gen cv2-like contours (w/ ndim=3, as in all other places) 2025-12-03 00:58:26 +01:00
Robert Sachunsky
43a95842bd writer: also ensure validity after scaling 2025-12-02 16:35:32 +01:00
kba
51abe9617a log to STDERR not STDOUT 2025-12-02 15:00:33 +01:00
Robert Sachunsky
56e73bf72f deskewing: add a 2nd stage for precision
after selecting the optimum angle on the original
search range, narrow down the search to its vicinity
with half the range (adding computational costs,
but gaining precision)
2025-11-28 18:27:58 +01:00
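The two-stage idea in the commit above can be sketched as a coarse-to-fine search. This is one reading of "half the range" (a second window half as wide as the first, centred on the first optimum, with the same number of samples) and not the project's implementation. The names `two_stage_angle`, `search_range` and `steps` as well as the toy `score` function are assumptions.

```python
def linspace(lo, hi, n):
    """n evenly spaced values from lo to hi (inclusive)."""
    return [lo + (hi - lo) * i / (n - 1) for i in range(n)]


def two_stage_angle(score, search_range=10.0, steps=21):
    """Return (coarse, fine) optimum angles for a score function to maximise."""
    # stage 1: scan the full range [-search_range, +search_range]
    coarse = max(linspace(-search_range, search_range, steps), key=score)
    # stage 2: scan a window of half the total width around the coarse optimum
    half = search_range / 2
    fine = max(linspace(coarse - half, coarse + half, steps), key=score)
    return coarse, fine


def score(angle):
    """Toy objective peaking at 1.3 degrees (stand-in for a projection profile score)."""
    return -(angle - 1.3) ** 2


coarse, fine = two_stage_angle(score)
```

The second stage doubles the number of score evaluations, and halving the window at a constant sample count halves the step size.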
Robert Sachunsky
adcea47bc0 return_boxes_of_images_by_order_of_reading_new: always erode
when passing the text region mask, do not apply erosion only
if there are more than 2 columns, but iff `not erosion_hurts`
(consistent with `find_num_col`'s expectations and making
 it as easy to find the column gaps on 1 and 2-column pages
 as on multi-column pages)
2025-11-28 18:23:59 +01:00
Robert Sachunsky
5a3de3b42d column detection: improve, aided by vseps whenever possible
- `find_number_of_columns_in_document`: retain vertical separators
  and pass to `find_num_col` for each vertical split
- `return_boxes_of_images_by_order_of_reading_new`: reconstruct
  the vertical separators from the segmentation mask and the separator
  bboxes; pass it on to `find_num_col` everywhere
- `return_boxes_of_images_by_order_of_reading_new`: no need to
  try-catch `find_num_col` anymore
- `return_boxes_of_images_by_order_of_reading_new`: when a vertical
  split has too few columns,
  * do not raise but lower the threshold `multiplier` responsible for
    allowing gaps as column boundaries
  * do not pass the `num_col_classifier` (i.e. expected number of
    resulting columns) of the entire page to the iterative
    `find_num_col` for each existing column, but only the portion
    of that span
2025-11-28 18:14:24 +01:00