Commit graph

35 commits

Author SHA1 Message Date
Robert Sachunsky
27f43c175f Merge branch 'main' into ro-fixes and resolve conflicts
major conflicts resolved manually:

- branches for non-`light` segmentation already removed in main
- Keras/TF setup and no TF1 sessions, esp. in new ModelZoo
- changes to binarizer and its CLI (`mode`, `overwrite`, `run_single()`)
- writer: `build...` w/ kwargs instead of positional
- training for segmentation/binarization/enhancement tasks:
  * drop unused `generate_data_from_folder()`
  * simplify `preprocess_imgs()`: turn `preprocess_img()`, `get_patches()`
    and `get_patches_num_scale_new()` into generators, only writing
    result files in the caller (top-level loop) instead of passing
    output directories and file counter
- training for new OCR task:
  * `train`: put keys into the additional `config_params` where they
    respectively belong (conditioned under existing keys), w/ better documentation
  * `train`: add new keys as kwargs to `run()` to make usable
  * `utils`: instead of custom data loader `data_gen_ocr()`, re-use
    existing `preprocess_imgs()` (for cfg capture and top-level loop),
    but extended w/ new kwargs and calling new `preprocess_img_ocr()`;
    the latter as single-image generator (also much simplified)
  * `train`: use tf.data loader pipeline from that generator w/ standard
    mechanisms for batching, shuffling, prefetching etc.
  * `utils` and `train`: instead of `vectorize_label`, use `Dataset.padded_batch`
  * add TensorBoard callback and re-use our checkpoint callback
  * also use standard Keras top-level loop for training

still problematic (substantially unresolved):
- `Patches` now only w/ fixed implicit size
  (ignoring training config params)
- `PatchEncoder` now only w/ fixed implicit num patches and projection dim
  (ignoring training config params)
2026-02-07 14:05:56 +01:00
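The generator refactoring described in this merge can be illustrated with a minimal sketch. All names below (`get_patches_sketch`, `preprocess_img_sketch`, `preprocess_imgs_sketch`, the nested-list "image") are hypothetical stand-ins, not the repository's actual functions. The sketch only shows the pattern: helpers yield results, and the top-level loop alone owns the output directory and the file counter.

```python
import os
import tempfile


def get_patches_sketch(image, size):
    """Yield non-overlapping size x size patches from a 2D list (hypothetical)."""
    height, width = len(image), len(image[0])
    for top in range(0, height - size + 1, size):
        for left in range(0, width - size + 1, size):
            yield [row[left:left + size] for row in image[top:top + size]]


def preprocess_img_sketch(image, size):
    """Yield each patch plus a flipped variant; no file I/O happens here."""
    for patch in get_patches_sketch(image, size):
        yield patch                          # original patch
        yield [row[::-1] for row in patch]   # horizontally flipped copy


def preprocess_imgs_sketch(images, size, out_dir):
    """Top-level loop: the only place that knows about paths and counters."""
    written = []
    count = 0
    for image in images:
        for result in preprocess_img_sketch(image, size):
            path = os.path.join(out_dir, f"img_{count}.txt")
            with open(path, "w") as handle:
                handle.write("\n".join(" ".join(map(str, row)) for row in result))
            written.append(path)
            count += 1
    return written
```

Because the helpers no longer take output directories or counters, the same generators could also feed a data loader directly instead of writing files.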
Robert Sachunsky
3c3effcfda drop TF1 vernacular, relax TF/Keras and Torch requirements
- do not restrict TF version, but depend on tf-keras and
  set `TF_USE_LEGACY_KERAS=1` to avoid Keras 3 behaviour
- relax Numpy version requirement up to v2
- relax Torch version requirement
- drop TF1 session management code
- drop TF1 config in favour of TF2 config code for memory growth
- training.*: also simplify and limit line length
- training.train: always train with TensorBoard callback
2026-01-20 11:34:02 +01:00
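A minimal sketch of what this setup could look like, assuming the `tf-keras` package is installed alongside TensorFlow. The helper name `configure_tensorflow` is hypothetical and not taken from the repository. The environment variable has to be set before TensorFlow is first imported, and the TF2 config API replaces the former TF1 session config for memory growth.

```python
import os

# Must be set before TensorFlow is imported; setting it later has no effect.
os.environ["TF_USE_LEGACY_KERAS"] = "1"


def configure_tensorflow():
    """Enable GPU memory growth via the TF2 config API (hypothetical helper).

    Returns the number of visible GPUs, or None if TensorFlow is not installed.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return None
    gpus = tf.config.list_physical_devices("GPU")
    for gpu in gpus:
        try:
            # Allocate GPU memory on demand instead of reserving it all upfront.
            tf.config.experimental.set_memory_growth(gpu, True)
        except RuntimeError:
            # Raised when the devices have already been initialized.
            pass
    return len(gpus)
```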
kba
04bc4a63d0 reorganize model_zoo 2025-10-22 16:04:48 +02:00
Robert Sachunsky
f0de1adabf rm loky dependency 2025-09-30 23:12:18 +02:00
vahidrezanezhad
31d9fa0c80 string alignment function added + changes needed for prediction with both bin and RGB inputs implemented 2025-05-25 21:44:36 +02:00
vahidrezanezhad
c12b09a868 I have tried to address issues #163 and #161. The changes have also improved marginal detection and enhanced the isolation of headers. 2025-05-12 00:10:18 +02:00
Robert Sachunsky
108ce1f5a1 Merge remote-tracking branch 'origin/main' into v3-api-release-foreal
(bad-ass difficult diff diffing)
2025-04-04 20:23:23 +02:00
Robert Sachunsky
ab3da17547
Update requirements.txt
Co-authored-by: Konstantin Baierer <kba@users.noreply.github.com>
2025-04-01 18:13:28 +02:00
Robert Sachunsky
3916474b8b OCR-D: require >=v3.1 2025-03-31 01:15:12 +02:00
Robert Sachunsky
1f4a17b60d Merge remote-tracking branch 'origin/machine_based_reading_order_integration' into v3-api 2025-03-30 21:21:59 +02:00
vahidrezanezhad
cf40f9ecc5 The rotate_image function produces the exact same rotation as Imutils. Therefore, there is no need to retain the remove-imutils-1 branch. 2025-03-28 20:58:32 +01:00
vahidrezanezhad
b55389ac62
Update requirements.txt 2025-03-28 14:59:31 +01:00
vahidrezanezhad
c9de578d4d removing imutils from requirements 2025-03-28 11:25:03 +01:00
kba
869110f185 merge main 2025-01-20 14:45:27 +01:00
Robert Sachunsky
0ae28f7d3e switch from stdlib to loky.ProcessPoolExecutor, ensure shutdown 2024-12-14 12:16:29 +00:00
Robert Sachunsky
f765e2603b move Torch to optional dependencies (to avoid clash with TF over CuDNN) 2024-12-04 15:57:13 +00:00
vahidrezanezhad
5fa8ca46a4 updating requirements 2024-11-14 17:35:00 +01:00
Clemens Neudecker
1ae77e61c8
Update requirements.txt 2024-11-11 14:11:36 +01:00
Clemens Neudecker
21893910b8
relax tf2 requirement to < 2.13 2024-10-16 14:20:53 +02:00
kba
fdedae2406 require ocrd>=3.0.0b4 2024-09-02 11:47:57 +02:00
kba
aef46a4669 require ocrd >= 3.0.0b1 2024-08-26 11:31:13 +02:00
kba
0a3f525f0a port processor to core v3 2024-08-23 18:22:25 +02:00
cneud
b3fa684395 pin tf2 version to 2.12.1
until we fix keras compatibility
2024-03-19 20:30:40 +01:00
Clemens Neudecker
03bfd7a390
Update requirements.txt
Update to `tensorflow>=2.12` (drops Python 3.7 support)
* fix #114 
* fix #115

Tested by @vahidrezanezhad @cneud
2023-09-26 18:16:20 +02:00
Clemens Neudecker
b58a327c5d
cap numpy to <1.24.0
OK, so now numpy is the culprit (shipped unbounded via ocrd): several deprecations expired with the release of v1.24.0 and require changes to our codebase, e.g.
* The deprecation for the aliases np.object, np.bool, np.float, np.complex, np.str, and np.int has expired
* Ragged array creation will now always raise a ValueError unless dtype=object is passed.

See also here: https://numpy.org/devdocs/release/1.24.0-notes.html#expired-deprecations
2023-08-17 22:07:45 +02:00
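The two expired deprecations named in this commit message can be sketched as follows. This is only an illustration of the kind of change they require, not code from the repository; the variable names are hypothetical.

```python
import numpy as np

# Builtin types replace the removed aliases np.bool, np.int, np.float, ...
mask = np.zeros(3, dtype=bool)
counts = np.arange(3, dtype=int)
scores = np.ones(3, dtype=float)

# Ragged (unequal-length) sequences need an explicit dtype=object;
# without it, NumPy >= 1.24 raises a ValueError.
contours = [[0, 1, 2], [3, 4]]
ragged = np.array(contours, dtype=object)
```

Written this way, the code behaves the same on NumPy versions before and after 1.24.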
Clemens Neudecker
e5acee09ab
cap tensorflow version to <2.12.0
Cap tensorflow version to <2.12.0 until we have time to adapt to the API changes, e.g.
* Support for Python 3.11 has been added.
* Support for Python 3.7 has been removed.
See also https://github.com/tensorflow/tensorflow/releases/tag/v2.12.0.
2023-08-17 21:05:51 +02:00
Robert Sachunsky
34a061782c
depend on tensorflow instead of tensorflow-gpu (#76) 2022-05-03 23:19:01 +02:00
cneud
8c11b2253d update requirements (use tf2) 2022-04-26 11:51:22 +02:00
70e7316907 🐛 Fix ocrd core requirement
eynollah requires ocrd >= 2.22.0 for the resource resolving code,
otherwise it fails with an AttributeError. Fix this by bumping up the
requirement.

I bumped it to 2.23.3 so core *also* includes the latest model resource
for eynollah.
2021-04-22 20:06:31 +02:00
Konstantin Baierer
da563519ec require setuptools >= 50 2021-02-05 13:34:17 +01:00
Konstantin Baierer
916e0a1870 restrict keras version to < 2.4 2021-02-05 10:23:24 +01:00
vahidrezanezhad
e3a55721f1
Update requirements.txt 2020-12-15 15:30:30 +01:00
Clemens Neudecker
a3e55582f7
tensorflow-gpu instead of tensorflow 2020-12-04 21:16:58 +01:00
Konstantin Baierer
30829bc9b2 seaborn not currently used 2020-12-04 13:52:03 +01:00
Konstantin Baierer
f229907e41 extend setup.py, add Makefile, gitignore, requirements.txt 2020-11-20 17:48:06 +01:00