# Eynollah
> Document Layout Analysis, Binarization and OCR with Deep Learning and Heuristics
[PyPI Version](https://pypi.python.org/pypi/eynollah)
[Python Versions](https://pypi.org/project/eynollah/)
[Test](https://github.com/qurator-spk/eynollah/actions/workflows/test-eynollah.yml)
[Build Docker](https://github.com/qurator-spk/eynollah/actions/workflows/build-docker.yml)
[License: Apache 2.0](https://opensource.org/license/apache-2-0/)
[DOI](https://doi.org/10.1145/3604951.3605513)

## Features
* Document layout analysis using pixelwise segmentation models with support for 10 distinct segmentation classes:
  * background, [page border](https://ocr-d.de/en/gt-guidelines/trans/lyRand.html), [text region](https://ocr-d.de/en/gt-guidelines/trans/lytextregion.html#textregionen__textregion_), [text line](https://ocr-d.de/en/gt-guidelines/pagexml/pagecontent_xsd_Complex_Type_pc_TextLineType.html), [header](https://ocr-d.de/en/gt-guidelines/trans/lyUeberschrift.html), [image](https://ocr-d.de/en/gt-guidelines/trans/lyBildbereiche.html), [separator](https://ocr-d.de/en/gt-guidelines/trans/lySeparatoren.html), [marginalia](https://ocr-d.de/en/gt-guidelines/trans/lyMarginalie.html), [initial](https://ocr-d.de/en/gt-guidelines/trans/lyInitiale.html), [table](https://ocr-d.de/en/gt-guidelines/trans/lyTabellen.html)
* Textline segmentation into bounding boxes or polygons (contours), including support for curved lines and vertical text
* Document image binarization with pixelwise segmentation or hybrid CNN-Transformer models
* Text recognition (OCR) with CNN-RNN or TrOCR models
* Detection of reading order (left-to-right or right-to-left) using heuristics or trainable models
* Output in [PAGE-XML](https://github.com/PRImA-Research-Lab/PAGE-XML)
* [OCR-D](https://github.com/qurator-spk/eynollah#use-as-ocr-d-processor) interface

:warning: Development focuses on achieving the best possible results for a wide variety of historical documents using a combination of multiple deep learning models and heuristics; processing can therefore be slow.
## Installation
Python `3.8-3.11` with TensorFlow `<2.13` on Linux is currently supported.
For (limited) GPU support, the CUDA toolkit needs to be installed;
a known working configuration is CUDA `11.8` with cuDNN `8.6`.
You can either install from PyPI
```sh
pip install eynollah
```
or clone the repository, enter it, and install in editable mode with
```sh
git clone git@github.com:qurator-spk/eynollah.git
cd eynollah; pip install -e .
```
Alternatively, you can run `make install`, or `make install-dev` for an editable installation.
To also install the dependencies for the OCR engines:
```sh
pip install "eynollah[OCR]"
# or
make install EXTRAS=OCR
```
### Docker
Pull the latest Docker image with
```sh
docker pull ghcr.io/qurator-spk/eynollah:latest
```
For instructions on running Eynollah with Docker, see [`docker.md`](https://github.com/qurator-spk/eynollah/tree/main/docs/docker.md).
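A typical invocation mounts the working directory into the container and calls the CLI inside it. This is only a sketch: the mount point `/data` and the input filename `page.png` are placeholders chosen for this example, not values from this README; consult `docker.md` for the authoritative command and required model setup.

```sh
# Run layout analysis on an image in the current directory.
# "$PWD" is mounted at /data (an arbitrary mount point for this example);
# input and output paths must refer to locations inside the container.
docker run --rm -v "$PWD":/data ghcr.io/qurator-spk/eynollah:latest \
  eynollah layout -i /data/page.png -o /data
```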
## Models
Pretrained models can be downloaded from [Zenodo](https://zenodo.org/records/17194824) or [Hugging Face](https://huggingface.co/SBB?search_models=eynollah).
For model documentation and model cards, see [`models.md`](https://github.com/qurator-spk/eynollah/tree/main/docs/models.md).
## Training
To train your own model with Eynollah, see [`train.md`](https://github.com/qurator-spk/eynollah/tree/main/docs/train.md) and use the tools in the [`train`](https://github.com/qurator-spk/eynollah/tree/main/train) folder.
## Usage
Eynollah supports five use cases:
1. [layout analysis (segmentation)](#layout-analysis),
2. [binarization](#binarization),
3. [image enhancement](#image-enhancement),
4. [text recognition (OCR)](#ocr), and
5. [reading order detection](#reading-order-detection).
### Layout Analysis
The layout analysis module is responsible for detecting layout elements, identifying text lines, and determining reading
order using heuristic methods or a [pretrained model](https://github.com/qurator-spk/eynollah#machine-based-reading-order).
The command-line interface for layout analysis can be called like this:
```sh
eynollah layout \
-i | -di \
-o