From e5254dc6c5bfcf2ee6d7b2b8636c14e32674f12f Mon Sep 17 00:00:00 2001
From: cneud <952378+cneud@users.noreply.github.com>
Date: Mon, 20 Oct 2025 22:39:54 +0200
Subject: [PATCH] integrate training docs

---
 docs/train.md   | 38 ++++++++++++++++++++++++++++++++++++++
 train/README.md | 43 -------------------------------------------
 2 files changed, 38 insertions(+), 43 deletions(-)
 delete mode 100644 train/README.md

diff --git a/docs/train.md b/docs/train.md
index 252bead..ffa39a9 100644
--- a/docs/train.md
+++ b/docs/train.md
@@ -1,3 +1,41 @@
+# Prerequisites
+
+## 1. Install Eynollah with training dependencies
+
+Clone the repository and install eynollah along with the dependencies necessary for training:
+
+```sh
+git clone https://github.com/qurator-spk/eynollah
+cd eynollah
+pip install '.[training]'
+```
+
+## 2. Pretrained encoder
+
+Download our pretrained weights and add them to a `train/pretrained_model` folder:
+
+```sh
+cd train
+wget -O pretrained_model.tar.gz https://zenodo.org/records/17243320/files/pretrained_model_v0_5_1.tar.gz?download=1
+tar xf pretrained_model.tar.gz
+```
+
+## 3. Example data
+
+### Binarization
+A small sample of training data for a binarization experiment can be found on [Zenodo](https://zenodo.org/records/17243320/files/training_data_sample_binarization_v0_5_1.tar.gz?download=1),
+which contains `images` and `labels` folders.
+
+## 4. Helpful tools
+
+* [`pagexml2img`](https://github.com/qurator-spk/page2img)
+> Tool to extract 2-D or 3-D RGB images from PAGE-XML data. In the former case, the output will be 1 2-D image array which each class has filled with a pixel value. In the case of a 3-D RGB image,
+each class will be defined with a RGB value and beside images, a text file of classes will also be produced. 
+* [`cocoSegmentationToPng`](https://github.com/nightrome/cocostuffapi/blob/17acf33aef3c6cc2d6aca46dcf084266c2778cf0/PythonAPI/pycocotools/cocostuffhelper.py#L130)
+> Convert COCO GT or results for a single image to a segmentation map and write it to disk.
+* [`ocrd-segment-extract-pages`](https://github.com/OCR-D/ocrd_segment/blob/master/ocrd_segment/extract_pages.py)
+> Extract region classes and their colours in mask (pseg) images. Allows the color map as free dict parameter, and comes with a default that mimics PageViewer's coloring for quick debugging; it also warns when regions do overlap.
+
 # Training documentation
 
 This document aims to assist users in preparing training datasets, training models, and
diff --git a/train/README.md b/train/README.md
deleted file mode 100644
index d270542..0000000
--- a/train/README.md
+++ /dev/null
@@ -1,43 +0,0 @@
-# Training eynollah
-
-This README explains the technical details of how to set up and run training, for detailed information on parameterization, see [`docs/train.md`](../docs/train.md)
-
-## Introduction
-
-This folder contains the source code for training an encoder model for document image segmentation.
-
-## Installation
-
-Clone the repository and install eynollah along with the dependencies necessary for training:
-
-```sh
-git clone https://github.com/qurator-spk/eynollah
-cd eynollah
-pip install '.[training]'
-```
-
-### Pretrained encoder
-
-Download our pretrained weights and add them to a `train/pretrained_model` folder:
-
-```sh
-cd train
-wget -O pretrained_model.tar.gz https://zenodo.org/records/17243320/files/pretrained_model_v0_5_1.tar.gz?download=1
-tar xf pretrained_model.tar.gz
-```
-
-### Binarization training data
-
-A small sample of training data for binarization experiment can be found [on
-zenodo](https://zenodo.org/records/17243320/files/training_data_sample_binarization_v0_5_1.tar.gz?download=1),
-which contains `images` and `labels` folders. 
-
-### Helpful tools
-
-* [`pagexml2img`](https://github.com/qurator-spk/page2img)
-> Tool to extract 2-D or 3-D RGB images from PAGE-XML data. In the former case, the output will be 1 2-D image array which each class has filled with a pixel value. In the case of a 3-D RGB image,
-each class will be defined with a RGB value and beside images, a text file of classes will also be produced.
-* [`cocoSegmentationToPng`](https://github.com/nightrome/cocostuffapi/blob/17acf33aef3c6cc2d6aca46dcf084266c2778cf0/PythonAPI/pycocotools/cocostuffhelper.py#L130)
-> Convert COCO GT or results for a single image to a segmentation map and write it to disk.
-* [`ocrd-segment-extract-pages`](https://github.com/OCR-D/ocrd_segment/blob/master/ocrd_segment/extract_pages.py)
-> Extract region classes and their colours in mask (pseg) images. Allows the color map as free dict parameter, and comes with a default that mimics PageViewer's coloring for quick debugging; it also warns when regions do overlap.