From 6b65cea24ae351343812ca5f74c25eabb8788439 Mon Sep 17 00:00:00 2001
From: Clemens Neudecker <952378+cneud@users.noreply.github.com>
Date: Fri, 20 Nov 2020 22:12:06 +0100
Subject: [PATCH] Update README.md

---
 README.md | 34 +++++++++++++++++++++++++++-------
 1 file changed, 27 insertions(+), 7 deletions(-)

diff --git a/README.md b/README.md
index e4e948a..28037c8 100644
--- a/README.md
+++ b/README.md
@@ -1,9 +1,21 @@
-# Textline Detection
-> Detect textlines in document images
+# Eynollah
+> Document Layout Analysis
 
 ## Introduction
-This tool (eynollah) performs border, region and textline detection and scaling and enhancing from document image data and returns the results as [PAGE-XML](https://github.com/PRImA-Research-Lab/PAGE-XML).
-The goal of this project is to extract textlines of a document in order to feed them to an OCR model. This is achieved by four successive stages as follows:
+This tool (eynollah) performs document layout analysis (segmentation) from document image data and returns the results as [PAGE-XML](https://github.com/PRImA-Research-Lab/PAGE-XML).
+
+It can currently detect the following layout classes:
+* Border
+* Textregion
+* Image
+* Textline
+* Separator
+* Marginalia
+* Initial
+
+The final goal is to feed the output to an OCR model.
+
+The tool uses a combination of various models and heuristics:
 * [Border detection](https://github.com/qurator-spk#border-detection)
 * [Layout detection](https://github.com/qurator-spk#layout-detection)
 * [Textline detection](https://github.com/qurator-spk#textline-detection)
@@ -47,9 +59,17 @@ In order to run this tool you also need trained models. You can download our pre
 
 The basic command-line interface can be called like this:
 
-    eynollah -i -o -m -fl -ae -as -cl -si
-
-The tool does accepts and works better on original images (RGB format).
+    eynollah \
+    -i \
+    -o \
+    -m \
+    -fl \
+    -ae \
+    -as \
+    -cl \
+    -si
+
+The tool does accept and works better on original images (RGB format) than binarized images.
 
 ### How and where to use