mirror of
https://github.com/qurator-spk/eynollah.git
synced 2026-02-21 00:41:56 +01:00
Merge branch 'main' into ro-fixes and resolve conflicts…
major conflicts resolved manually:
- branches for non-`light` segmentation already removed in main
- Keras/TF setup and no TF1 sessions, esp. in new ModelZoo
- changes to binarizer and its CLI (`mode`, `overwrite`, `run_single()`)
- writer: `build...` w/ kwargs instead of positional
- training for segmentation/binarization/enhancement tasks:
  * drop unused `generate_data_from_folder()`
  * simplify `preprocess_imgs()`: turn `preprocess_img()`, `get_patches()`
    and `get_patches_num_scale_new()` into generators, only writing
    result files in the caller (top-level loop) instead of passing
    output directories and file counter
- training for new OCR task:
  * `train`: put keys into additional `config_params` where they belong,
    resp. (conditioned under existing keys), and w/ better documentation
  * `train`: add new keys as kwargs to `run()` to make usable
  * `utils`: instead of custom data loader `data_gen_ocr()`, re-use
    existing `preprocess_imgs()` (for cfg capture and top-level loop),
    but extended w/ new kwargs and calling new `preprocess_img_ocr()`;
    the latter as single-image generator (also much simplified)
  * `train`: use tf.data loader pipeline from that generator w/ standard
    mechanisms for batching, shuffling, prefetching etc.
  * `utils` and `train`: instead of `vectorize_label`, use `Dataset.padded_batch`
  * add TensorBoard callback and re-use our checkpoint callback
  * also use standard Keras top-level loop for training
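
For the OCR items above, `Dataset.padded_batch` by default pads every sequence in a batch up to the longest sequence in that batch. The sketch below imitates that behaviour in plain Python so that it runs without TensorFlow. `padded_batches()` and `_pad()` are hypothetical helpers, not part of Eynollah or of `tf.data`, and the padding value 0 is an assumption that mirrors the `tf.data` default for numeric tensors.

```python
from typing import Iterable, Iterator, List, Sequence


def _pad(batch: List[Sequence[int]], pad_value: int) -> List[List[int]]:
    """Pad every sequence in the batch to the length of the longest one."""
    longest = max(len(seq) for seq in batch)
    return [list(seq) + [pad_value] * (longest - len(seq)) for seq in batch]


def padded_batches(
    sequences: Iterable[Sequence[int]],
    batch_size: int,
    pad_value: int = 0,
) -> Iterator[List[List[int]]]:
    """Group sequences into batches, padding per batch (hypothetical helper)."""
    batch: List[Sequence[int]] = []
    for seq in sequences:
        batch.append(seq)
        if len(batch) == batch_size:
            yield _pad(batch, pad_value)
            batch = []
    if batch:  # keep the final, possibly smaller batch
        yield _pad(batch, pad_value)


if __name__ == "__main__":
    labels = [[5, 2], [7], [1, 2, 3], [4]]  # toy encoded text lines
    for padded in padded_batches(labels, batch_size=2):
        print(padded)
```

Because padding is decided per batch, short lines are not stretched to a global maximum length; whether that matters for a given model is outside the scope of this sketch.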
still problematic (substantially unresolved):
- `Patches` now only w/ fixed implicit size
  (ignoring training config params)
- `PatchEncoder` now only w/ fixed implicit num patches and projection dim
  (ignoring training config params)
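
The generator refactoring of `preprocess_imgs()` noted among the resolved conflicts can be illustrated with a minimal sketch. It is not the repository's code: `iter_patches()` and `write_patches()` are hypothetical stand-ins, images are plain nested lists, and the file naming is invented for illustration. It only shows the shape of the change, in which the inner function yields patches and the top-level loop alone owns the output directory and the file counter.

```python
import json
import tempfile
from pathlib import Path
from typing import Iterator, List

Image = List[List[int]]  # toy stand-in for an image array (rows of pixel values)


def iter_patches(img: Image, size: int) -> Iterator[Image]:
    """Yield non-overlapping size x size patches; no I/O happens here.

    Hypothetical helper, not Eynollah's get_patches(). Rows and columns
    that do not fill a whole patch are dropped for simplicity.
    """
    height, width = len(img), len(img[0])
    for top in range(0, height - size + 1, size):
        for left in range(0, width - size + 1, size):
            yield [row[left:left + size] for row in img[top:top + size]]


def write_patches(images: List[Image], out_dir: Path, size: int) -> int:
    """Top-level loop: the only place that knows about paths and counters."""
    count = 0
    for img in images:
        for patch in iter_patches(img, size):
            # Invented naming scheme; JSON keeps the sketch dependency-free.
            (out_dir / f"patch_{count:04d}.json").write_text(json.dumps(patch))
            count += 1
    return count


if __name__ == "__main__":
    demo = [[r * 4 + c for c in range(4)] for r in range(4)]  # one 4x4 "image"
    with tempfile.TemporaryDirectory() as tmp:
        written = write_patches([demo], Path(tmp), size=2)
        print(written, "patches written")
```

Keeping the generator free of side effects means it can be tested without touching the file system, and the same generator could feed a different consumer than the file writer.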
commit 27f43c175f
77 changed files with 5597 additions and 4952 deletions
@ -1,3 +1,41 @@
# Prerequisites

## 1. Install Eynollah with training dependencies

Clone the repository and install Eynollah along with the dependencies needed for training:

```sh
git clone https://github.com/qurator-spk/eynollah
cd eynollah
pip install '.[training]'
```

## 2. Pretrained encoder

Download our pretrained weights and add them to a `train/pretrained_model` folder:

```sh
cd train
wget -O pretrained_model.tar.gz 'https://zenodo.org/records/17243320/files/pretrained_model_v0_5_1.tar.gz?download=1'
tar xf pretrained_model.tar.gz
```

## 3. Example data

### Binarization

A small sample of training data for binarization experiments is available on [Zenodo](https://zenodo.org/records/17243320/files/training_data_sample_binarization_v0_5_1.tar.gz?download=1).
The archive contains `images` and `labels` folders.
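
Before training on such a sample, it can help to confirm that every file in `images` has a counterpart in `labels`. The sketch below assumes that an image and its label share the same file stem (for example `0001.png` in both folders). That pairing rule, the `unpaired_stems()` helper and the default path are assumptions for illustration, not something this documentation specifies.

```python
import sys
from pathlib import Path
from typing import Set, Tuple


def unpaired_stems(data_dir: Path) -> Tuple[Set[str], Set[str]]:
    """Return (images without a label, labels without an image), by file stem.

    Hypothetical helper; assumes `images/` and `labels/` use matching stems.
    """
    images = {p.stem for p in (data_dir / "images").iterdir() if p.is_file()}
    labels = {p.stem for p in (data_dir / "labels").iterdir() if p.is_file()}
    return images - labels, labels - images


if __name__ == "__main__":
    # Hypothetical default location of the extracted sample; adjust as needed.
    root = Path(sys.argv[1]) if len(sys.argv) > 1 else Path("training_data_sample_binarization")
    if (root / "images").is_dir() and (root / "labels").is_dir():
        missing_labels, missing_images = unpaired_stems(root)
        print("images without label:", sorted(missing_labels))
        print("labels without image:", sorted(missing_images))
    else:
        print(f"no images/ and labels/ folders found under {root}")
```

Two empty lists indicate that the folders line up under the assumed naming rule.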

## 4. Helpful tools

* [`pagexml2img`](https://github.com/qurator-spk/page2img)
> Tool to extract 2-D or 3-D RGB images from PAGE-XML data. In the former case, the output is a single 2-D image array in which each class is filled with a pixel value. In the case of a 3-D RGB image, each class is defined by an RGB value and, besides the images, a text file listing the classes is also produced.
* [`cocoSegmentationToPng`](https://github.com/nightrome/cocostuffapi/blob/17acf33aef3c6cc2d6aca46dcf084266c2778cf0/PythonAPI/pycocotools/cocostuffhelper.py#L130)
> Convert COCO GT or results for a single image to a segmentation map and write it to disk.
* [`ocrd-segment-extract-pages`](https://github.com/OCR-D/ocrd_segment/blob/master/ocrd_segment/extract_pages.py)
> Extract region classes and their colours in mask (pseg) images. Allows the color map as free dict parameter, and comes with a default that mimics PageViewer's coloring for quick debugging; it also warns when regions do overlap.

# Training documentation

This document aims to assist users in preparing training datasets, training models, and