From 37ad054dfaffa4b01ec91a0dfc6cd7ae0d177466 Mon Sep 17 00:00:00 2001
From: Clemens Neudecker <952378+cneud@users.noreply.github.com>
Date: Mon, 30 Nov 2020 14:58:17 +0100
Subject: [PATCH] Update README.md

---
 README.md | 30 ++++++++++++++----------------
 1 file changed, 14 insertions(+), 16 deletions(-)

diff --git a/README.md b/README.md
index cb73823..75b219d 100644
--- a/README.md
+++ b/README.md
@@ -16,7 +16,7 @@ It can currently detect the following layout classes/elements:
 * Marginalia
 * Initial
 
-In addition, the tool can be used to determine the _Reading Order_ of regions. The final goal is to feed the output to an OCR model.
+In addition, the tool can be used to detect the _Reading Order_ of regions. The final goal is to feed the output to an OCR model.
 
 The tool uses a combination of various models and heuristics:
 * [Border detection](https://github.com/qurator-spk/eynollah#border-detection)
@@ -66,30 +66,28 @@ The basic command-line interface can be called like this:
 -i \
 -o \
 -m \
- -fl \
- -ae \
- -as \
- -cl \
- -si
+ -fl \
+ -ae \
+ -as \
+ -cl \
+ -si
 
 The tool does accept and works better on original images (RGB format) than binarized images.
 
 ### How to use
 
-First of all, for this model we have trained 9 models which are doing different jobs like size detection (or column classifier), enhancing, page extraction, main layout detection, full layout detection and textline detetction. But this does not mean all those 9 models are needed for each document. Based on document and parameters it can be different. It is worthy to mention that with this tool we are able to detect reading order of text regions for simple documents (I will not go in detail with order of reading since it is a complex issue and many factors play a role about it).
+First of all, this model makes use of up to 9 trained models which are responsible for different operations like size detection, column classification, image enhancement, page extraction, main layout detection, full layout detection and textline detection. But this does not mean that all 9 models are always required for every document. Based on the document characteristics and parameters specified, different scenarios can be applied.
 
-* If none of parameters is set to true, this tool will try to do a layout detection of main regions (background, text, images, separators and marginals). Actually, advantage of this tool is that it has tried to extract main text regions separately as much as possible.
+* If none of the parameters is set to `true`, the tool will perform a layout detection of main regions (background, text, images, separators and marginals). An advantage of this tool is that it tries to extract main text regions separately as much as possible.
 
-* If you set `-ae` (allow enhancement) paremeter to `true`, this tool would check first dpi of document and if it is less than 300 then our tool first will resize it and then enhancement will occur. In fact enhancemnet can take place even without this option but by setting this option to true layout (better say xml data) will be written on resized and enhanced image instead of original image.
+* If you set `-ae` (allow image enhancement) parameter to `true`, the tool will first check the ppi (pixel-per-inch) of the image and when it is less than 300, the tool will resize it and only then image enhancement will occur. Image enhancement can also take place without this option, but by setting this option to `true`, the layout xml data (e.g. coordinates) will be based on the resized and enhanced image instead of the original image.
 
-* Some documents quality are really good but their scale is extremly big and therefore the performance of tool decreases. In those cases you can set `-as` (allow scaling) to `true`. With this option our tool first would try to scale image and then layout detection process will begin.
+* For some documents, while the quality is good, their scale is extremly large and the performance of tool decreases. In such cases you can set `-as` (allow scaling) to `true`. With this option enabled, the tool will try to rescale the image and only then the layout detection process will begin.
 
-* If you care about drop capitals and headings you can set `-fl` (full layout) to `true`. As we can see in the case of full layout we can detect 7 elements of document.
+* If you care about drop capitals (initials) and headings, you can set `-fl` (full layout) to `true`. As we can see in the case of full layout, we can currently distinguish 7 document layout classes/elements.
 
-* We face documents which include curved header or curved lines and it is abvious that a rectangle bounding boxes for textlines would never be a great option. So, we have developed an option which can try to find contours of those curvy textlines. You can set `-cl` (curved lines) to `true` to have this option. Be carefull that this increase the time, the tool needs to go through document.
-
-* If you want to crop and save image regions inside document just provide a directory with this parameter, `-si` (save images).
-
-* At the end this tool still needs to be optimized and developed. So if any problems occur or this tool performance does not meet your expectation, you can provide us your worthy feedback.
+* In cases where the documents include curved headers or curved lines it is obvious that rectangular bounding boxes for textlines will not be a great option. For this, we have developed an option which tries to find contours of the curvy textlines. You can set `-cl` (curved lines) to `true` to enable this option. Be advised that this will increase the time needed for the tool to process the document.
+* If you want to crop and save image regions inside the document, just provide a directory with the parameter, `-si` (save images).
+* This tool is actively being developed. If any problems occur or the performance does not meet your expectations, we welcome your feedback.