From 2fc723d292093cdfb263e2d6681e478d7018b953 Mon Sep 17 00:00:00 2001
From: vahidrezanezhad
Date: Wed, 22 Oct 2025 18:29:14 +0200
Subject: [PATCH] extend README

---
 README.md | 66 ++++++++++++++++++++++++++++++++++++++-----------------
 1 file changed, 46 insertions(+), 20 deletions(-)

diff --git a/README.md b/README.md
index 3ba5086..c6ba0e4 100644
--- a/README.md
+++ b/README.md
@@ -91,24 +91,35 @@ eynollah layout \
 
 The following options can be used to further configure the processing:
 
-| option | description |
-|-------------------|:-------------------------------------------------------------------------------|
-| `-fl` | full layout analysis including all steps and segmentation classes |
-| `-light` | lighter and faster but simpler method for main region detection and deskewing |
-| `-tll` | this indicates the light textline and should be passed with light version |
-| `-tab` | apply table detection |
-| `-ae` | apply enhancement (the resulting image is saved to the output directory) |
-| `-as` | apply scaling |
-| `-cl` | apply contour detection for curved text lines instead of bounding boxes |
-| `-ib` | apply binarization (the resulting image is saved to the output directory) |
-| `-ep` | enable plotting (MUST always be used with `-sl`, `-sd`, `-sa`, `-si` or `-ae`) |
-| `-eoi` | extract only images to output directory (other processing will not be done) |
-| `-ho` | ignore headers for reading order dectection |
-| `-si ` | save image regions detected to this directory |
-| `-sd ` | save deskewed image to this directory |
-| `-sl ` | save layout prediction as plot to this directory |
-| `-sp ` | save cropped page image to this directory |
-| `-sa ` | save all (plot, enhanced/binary image, layout) to this directory |
+| option | description |
+|-------------------|:------------------------------------------------------------------------------- |
+| `-fl` | full layout analysis including all steps and segmentation classes (recommended) |
+| `-light` | lighter and faster but simpler method for main region detection and deskewing (recommended) |
+| `-tll` | this indicates the light textline and should be passed with light version (recommended) |
+| `-tab` | apply table detection |
+| `-ae` | apply enhancement (the resulting image is saved to the output directory) |
+| `-as` | apply scaling |
+| `-cl` | apply contour detection for curved text lines instead of bounding boxes |
+| `-ib` | apply binarization (the resulting image is saved to the output directory) |
+| `-ep` | enable plotting (MUST always be used with `-sl`, `-sd`, `-sa`, `-si` or `-ae`) |
+| `-eoi` | extract only images to output directory (other processing will not be done) |
+| `-ho` | ignore headers for reading order detection |
+| `-si ` | save image regions detected to this directory |
+| `-sd ` | save deskewed image to this directory |
+| `-sl ` | save layout prediction as plot to this directory |
+| `-sp ` | save cropped page image to this directory |
+| `-sa ` | save all (plot, enhanced/binary image, layout) to this directory |
+| `-thart` | threshold of artificial class in the case of textline detection. The default value is 0.1 |
+| `-tharl` | threshold of artificial class in the case of layout detection. The default value is 0.1 |
+| `-ocr` | do ocr |
+| `-tr` | apply transformer ocr. Default model is a CNN-RNN model |
+| `-bs_ocr` | ocr inference batch size. Default batch sizes for trocr and cnn_rnn models are 2 and 8, respectively |
+| `-ncu` | upper limit of columns in document image |
+| `-ncl` | lower limit of columns in document image |
+| `-slro` | skip layout detection and reading order |
+| `-romb` | apply machine based reading order detection |
+| `-ipe` | ignore page extraction |
+
 
 If no further option is set, the tool performs layout detection of main regions (background, text, images, separators
 and marginals).
@@ -124,7 +135,7 @@ The command-line interface for binarization can be called like this:
 eynollah binarization \
   -i | -di \
   -o \
-  -m \
+  -m
 ```
 
 ### OCR
@@ -138,9 +149,24 @@ eynollah ocr \
   -i | -di \
   -dx \
   -o \
-  -m | --model_name \
+  -m | --model_name
 ```
 
+The following options can be used to further configure the ocr processing:
+
+| option | description |
+|-------------------|:------------------------------------------------------------------------------- |
+| `-dib` | directory of bins (file type must be '.png'). Prediction with both RGB and bins. |
+| `-doit` | Directory containing output images rendered with the predicted text |
+| `--model_name` | Specific model file path to use for OCR |
+| `-trocr` | transformer ocr will be applied, otherwise cnn_rnn model |
+| `-etit` | textlines images and text in xml will be exported into output dir (OCR training data) |
+| `-nmtc` | cropped textline images will not be masked with textline contour |
+| `-bs` | ocr inference batch size. Default batch sizes for trocr and cnn_rnn models are 2 and 8, respectively |
+| `-ds_pref` | add an abbreviation of dataset name to generated training data |
+| `-min_conf` | minimum OCR confidence value. OCRs with textline conf lower than this will be ignored |
+
+
 ### Machine-based-reading-order
 
 The machine-based reading-order module employs a pretrained model to identify the reading order from layouts represented in PAGE-XML files.