From e5254dc6c5bfcf2ee6d7b2b8636c14e32674f12f Mon Sep 17 00:00:00 2001
From: cneud <952378+cneud@users.noreply.github.com>
Date: Mon, 20 Oct 2025 22:39:54 +0200
Subject: [PATCH] integrate training docs

---
 docs/train.md   | 38 ++++++++++++++++++++++++++++++++++++++
 train/README.md | 43 -------------------------------------------
 2 files changed, 38 insertions(+), 43 deletions(-)
 delete mode 100644 train/README.md

diff --git a/docs/train.md b/docs/train.md
index 252bead..ffa39a9 100644
--- a/docs/train.md
+++ b/docs/train.md
@@ -1,3 +1,41 @@
+# Prerequisites
+
+## 1. Install Eynollah with training dependencies
+
+Clone the repository and install eynollah along with the dependencies necessary for training:
+
+```sh
+git clone https://github.com/qurator-spk/eynollah
+cd eynollah
+pip install '.[training]'
+```
+
+## 2. Pretrained encoder
+
+Download our pretrained weights and add them to a `train/pretrained_model` folder:
+
+```sh
+cd train
+wget -O pretrained_model.tar.gz 'https://zenodo.org/records/17243320/files/pretrained_model_v0_5_1.tar.gz?download=1'
+tar xf pretrained_model.tar.gz
+```
+
+## 3. Example data
+
+### Binarization
+A small sample of training data for binarization experiments is available on [Zenodo](https://zenodo.org/records/17243320/files/training_data_sample_binarization_v0_5_1.tar.gz?download=1).
+The archive contains `images` and `labels` folders.
+
+## 4. Helpful tools
+
+* [`pagexml2img`](https://github.com/qurator-spk/page2img)
+> Tool to extract 2-D or 3-D RGB images from PAGE-XML data. In the former case, the output is a single 2-D image array in which each class is filled with its own pixel value. In the case of a 3-D RGB image,
+each class is defined by an RGB value and, besides the images, a text file listing the classes is also produced.
+* [`cocoSegmentationToPng`](https://github.com/nightrome/cocostuffapi/blob/17acf33aef3c6cc2d6aca46dcf084266c2778cf0/PythonAPI/pycocotools/cocostuffhelper.py#L130)
+> Convert COCO GT or results for a single image to a segmentation map and write it to disk.
+* [`ocrd-segment-extract-pages`](https://github.com/OCR-D/ocrd_segment/blob/master/ocrd_segment/extract_pages.py)
+> Extract region classes and their colors in mask (pseg) images. Accepts the color map as a free dict parameter and comes with a default that mimics PageViewer's coloring for quick debugging; it also warns when regions overlap.
+
 # Training documentation
 
 This document aims to assist users in preparing training datasets, training models, and
diff --git a/train/README.md b/train/README.md
deleted file mode 100644
index d270542..0000000
--- a/train/README.md
+++ /dev/null
@@ -1,43 +0,0 @@
-# Training eynollah
-
-This README explains the technical details of how to set up and run training, for detailed information on parameterization, see [`docs/train.md`](../docs/train.md)
-
-## Introduction
-
-This folder contains the source code for training an encoder model for document image segmentation.
-
-## Installation
-
-Clone the repository and install eynollah along with the dependencies necessary for training:
-
-```sh
-git clone https://github.com/qurator-spk/eynollah
-cd eynollah
-pip install '.[training]'
-```
-
-### Pretrained encoder
-
-Download our pretrained weights and add them to a `train/pretrained_model` folder:   
-
-```sh
-cd train
-wget -O pretrained_model.tar.gz https://zenodo.org/records/17243320/files/pretrained_model_v0_5_1.tar.gz?download=1
-tar xf pretrained_model.tar.gz
-```
-
-### Binarization training data
-
-A small sample of training data for binarization experiment can be found [on
-zenodo](https://zenodo.org/records/17243320/files/training_data_sample_binarization_v0_5_1.tar.gz?download=1),
-which contains `images` and `labels` folders.
-
-### Helpful tools
-
-* [`pagexml2img`](https://github.com/qurator-spk/page2img)
-> Tool to extract 2-D or 3-D RGB images from PAGE-XML data. In the former case, the output will be 1 2-D image array which each class has filled with a pixel value. In the case of a 3-D RGB image, 
-each class will be defined with a RGB value and beside images, a text file of classes will also be produced.
-* [`cocoSegmentationToPng`](https://github.com/nightrome/cocostuffapi/blob/17acf33aef3c6cc2d6aca46dcf084266c2778cf0/PythonAPI/pycocotools/cocostuffhelper.py#L130)
-> Convert COCO GT or results for a single image to a segmentation map and write it to disk.
-* [`ocrd-segment-extract-pages`](https://github.com/OCR-D/ocrd_segment/blob/master/ocrd_segment/extract_pages.py)
-> Extract region classes and their colours in mask (pseg) images. Allows the color map as free dict parameter, and comes with a default that mimics PageViewer's coloring for quick debugging; it also warns when regions do overlap.