From 571dc84c3f0ef66f41c63e9ad722fae6a1668e5e Mon Sep 17 00:00:00 2001 From: Clemens Neudecker <952378+cneud@users.noreply.github.com> Date: Mon, 28 Mar 2022 13:15:35 +0200 Subject: [PATCH 01/34] README.md cleanup / restructuring --- README.md | 95 ++++++------------------------------------------------- 1 file changed, 9 insertions(+), 86 deletions(-) diff --git a/README.md b/README.md index c52560b..a9a86e8 100644 --- a/README.md +++ b/README.md @@ -1,58 +1,8 @@ # Eynollah -> Document Layout Analysis +> Perform document layout analysis (segmentation) from image data and return the results as [PAGE-XML](https://github.com/PRImA-Research-Lab/PAGE-XML). ![](https://user-images.githubusercontent.com/952378/102350683-8a74db80-3fa5-11eb-8c7e-f743f7d6eae2.jpg) -## Introduction -This tool performs document layout analysis (segmentation) from image data and returns the results as [PAGE-XML](https://github.com/PRImA-Research-Lab/PAGE-XML). - -It can currently detect the following layout classes/elements: -* [Border](https://ocr-d.de/en/gt-guidelines/pagexml/pagecontent_xsd_Complex_Type_pc_BorderType.html) -* [Textregion](https://ocr-d.de/en/gt-guidelines/pagexml/pagecontent_xsd_Complex_Type_pc_TextRegionType.html) -* [Textline](https://ocr-d.de/en/gt-guidelines/pagexml/pagecontent_xsd_Complex_Type_pc_TextLineType.html) -* [Image](https://ocr-d.de/en/gt-guidelines/pagexml/pagecontent_xsd_Complex_Type_pc_ImageRegionType.html) -* [Separator](https://ocr-d.de/en/gt-guidelines/pagexml/pagecontent_xsd_Complex_Type_pc_SeparatorRegionType.html) -* [Marginalia](https://ocr-d.de/en/gt-guidelines/trans/lyMarginalie.html) -* [Initial (Drop Capital)](https://ocr-d.de/en/gt-guidelines/trans/lyInitiale.html) - -In addition, the tool can be used to detect the _[ReadingOrder](https://ocr-d.de/en/gt-guidelines/trans/lyLeserichtung.html)_ of regions. The final goal is to feed the output to an OCR model. - -The tool uses a combination of various models and heuristics (see flowchart below for the different stages and how they interact): -* [Border detection](https://github.com/qurator-spk/eynollah#border-detection) -* [Layout detection](https://github.com/qurator-spk/eynollah#layout-detection) -* [Textline detection](https://github.com/qurator-spk/eynollah#textline-detection) -* [Image enhancement](https://github.com/qurator-spk/eynollah#Image_enhancement) -* [Scale classification](https://github.com/qurator-spk/eynollah#Scale_classification) -* [Heuristic methods](https://https://github.com/qurator-spk/eynollah#heuristic-methods) - -The first three stages are based on [pixel-wise segmentation](https://github.com/qurator-spk/sbb_pixelwise_segmentation). - -![](https://user-images.githubusercontent.com/952378/100619946-1936f680-331e-11eb-9297-6e8b4cab3c16.png) - -## Border detection -For the purpose of text recognition (OCR) and in order to avoid noise being introduced from texts outside the printspace, one first needs to detect the border of the printed frame. This is done by a binary pixel-wise-segmentation model trained on a dataset of 2,000 documents where about 1,200 of them come from the [dhSegment](https://github.com/dhlab-epfl/dhSegment/) project (you can download the dataset from [here](https://github.com/dhlab-epfl/dhSegment/releases/download/v0.2/pages.zip)) and the remainder having been annotated in SBB. For border detection, the model needs to be fed with the whole image at once rather than separated in patches. 
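To make the whole-image strategy concrete, here is a minimal sketch of border inference on a full page. This is an illustration only: the model path, input-size handling and post-processing are assumptions, not Eynollah's actual implementation.

```python
# Sketch: whole-image (non-patch) inference for border/printspace detection.
# The model filename below is a placeholder; use the models downloaded via
# `make models`.
import cv2
import numpy as np
from tensorflow.keras.models import load_model

def detect_border(image_path, model_path="models_eynollah/model_page.h5"):
    model = load_model(model_path, compile=False)
    img = cv2.imread(image_path)
    h, w = img.shape[:2]
    # feed the whole page at once, resized to the model's input resolution
    in_h, in_w = model.input_shape[1:3]
    inp = cv2.resize(img, (in_w, in_h)) / 255.0
    pred = model.predict(inp[np.newaxis], verbose=0)
    mask = np.argmax(pred, axis=3)[0].astype(np.uint8)
    # scale the binary page mask back to the original image size
    mask = cv2.resize(mask, (w, h), interpolation=cv2.INTER_NEAREST)
    # keep the largest contour as the printspace border
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    return max(contours, key=cv2.contourArea) if contours else None
```

Cropping the page to the bounding box of the returned contour then removes anything outside the printspace before the later stages run.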
- -## Layout detection -As a next step, text regions need to be identified by means of layout detection. Again a pixel-wise segmentation model was trained on 131 labeled images from the SBB digital collections, including some data augmentation. Since the target of this tool are historical documents, we consider as main region types text regions, separators, images, tables and background - each with their own subclasses, e.g. in the case of text regions, subclasses like header/heading, drop capital, main body text etc. While it would be desirable to detect and classify each of these classes in a granular way, there are also limitations due to having a suitably large and balanced training set. Accordingly, the current version of this tool is focussed on the main region types background, text region, image and separator. - -## Textline detection -In a subsequent step, binary pixel-wise segmentation is used again to classify pixels in a document that constitute textlines. For textline segmentation, a model was initially trained on documents with only one column/block of text and some augmentation with regard to scaling. By fine-tuning the parameters also for multi-column documents, additional training data was produced that resulted in a much more robust textline detection model. - -## Image enhancement -This is an image to image model which input was low quality of an image and label was actually the original image. For this one we did not have any GT, so we decreased the quality of documents in SBB and then feed them into model. - -## Scale classification -This is simply an image classifier which classifies images based on their scales or better to say based on their number of columns. - -## Heuristic methods -Some heuristic methods are also employed to further improve the model predictions: -* After border detection, the largest contour is determined by a bounding box, and the image cropped to these coordinates. -* For text region detection, the image is scaled up to make it easier for the model to detect background space between text regions. -* A minimum area is defined for text regions in relation to the overall image dimensions, so that very small regions that are noise can be filtered out. -* Deskewing is applied on the text region level (due to regions having different degrees of skew) in order to improve the textline segmentation result. -* After deskewing, a calculation of the pixel distribution on the X-axis allows the separation of textlines (foreground) and background pixels. -* Finally, using the derived coordinates, bounding boxes are determined for each textline. - ## Installation `pip install .` or @@ -66,13 +16,17 @@ Alternatively, you can also use `make` with these targets: ### Models -In order to run this tool you also need trained models. You can download our pretrained models from [qurator-data.de](https://qurator-data.de/eynollah/). +In order to run this tool you need trained models. You can download our pretrained models from [qurator-data.de](https://qurator-data.de/eynollah/). Alternatively, running `make models` will download and extract models to `$(PWD)/models_eynollah`. +### Training + +In case you want to train your own model to use with Eynollah, have a look at [sbb_pixelwise_segmentation](https://github.com/qurator-spk/sbb_pixelwise_segmentation). 
+ ## Usage -The basic command-line interface can be called like this: +The command-line interface can be called like this: ```sh eynollah \ @@ -94,37 +48,6 @@ eynollah \ ``` -The tool does accept and works better on original images (RGB format) than binarized images. - -### `--full-layout` vs `--no-full-layout` - -Here are the difference in elements detected depending on the `--full-layout`/`--no-full-layout` command line flags: - -| | `--full-layout` | `--no-full-layout` | -| --- | --- | --- | -| reading order | x | x | -| header regions | x | - | -| text regions | x | x | -| text regions / text line | x | x | -| drop-capitals | x | - | -| marginals | x | x | -| marginals / text line | x | x | -| image region | x | x | - -### How to use - -First, this model makes use of up to 9 trained models which are responsible for different operations like size detection, column classification, image enhancement, page extraction, main layout detection, full layout detection and textline detection.That does not mean that all 9 models are always required for every document. Based on the document characteristics and parameters specified, different scenarios can be applied. - -* If none of the parameters is set to `true`, the tool will perform a layout detection of main regions (background, text, images, separators and marginals). An advantage of this tool is that it tries to extract main text regions separately as much as possible. - -* If you set `-ae` (**a**llow image **e**nhancement) parameter to `true`, the tool will first check the ppi (pixel-per-inch) of the image and when it is less than 300, the tool will resize it and only then image enhancement will occur. Image enhancement can also take place without this option, but by setting this option to `true`, the layout xml data (e.g. coordinates) will be based on the resized and enhanced image instead of the original image. - -* For some documents, while the quality is good, their scale is very large, and the performance of tool decreases. In such cases you can set `-as` (**a**llow **s**caling) to `true`. With this option enabled, the tool will try to rescale the image and only then the layout detection process will begin. - -* If you care about drop capitals (initials) and headings, you can set `-fl` (**f**ull **l**ayout) to `true`. With this setting, the tool can currently distinguish 7 document layout classes/elements. - -* In cases where the document includes curved headers or curved lines, rectangular bounding boxes for textlines will not be a great option. In such cases it is strongly recommended setting the flag `-cl` (**c**urved **l**ines) to `true` to find contours of curved lines instead of rectangular bounding boxes. Be advised that enabling this option increases the processing time of the tool. - -* To crop and save image regions inside the document, set the parameter `-si` (**s**ave **i**mages) to true and provide a directory path to store the extracted images. +The tool performs better with RGB images than greyscale/binarized images. -* This tool is actively being developed. If problems occur, or the performance does not meet your expectations, we welcome your feedback via [issues](https://github.com/qurator-spk/eynollah/issues). +Additional documentation can be found in the [wiki](https://github.com/qurator-spk/eynollah/wiki). From 5dafa2095bbfb0aafb151a24fcab1afdd983d65f Mon Sep 17 00:00:00 2001 From: Clemens Neudecker <952378+cneud@users.noreply.github.com> Date: Tue, 29 Mar 2022 17:18:12 +0200 Subject: [PATCH 02/34] use
<details> instead of wiki

---
 README.md | 125 +++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 124 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index a9a86e8..a35f972 100644
--- a/README.md
+++ b/README.md
@@ -50,4 +50,127 @@ eynollah \
 The tool performs better with RGB images than greyscale/binarized images.

-Additional documentation can be found in the [wiki](https://github.com/qurator-spk/eynollah/wiki).
+## Documentation
+
+
+ click to expand/collapse + +## Region types + +
+ click to expand/collapse
+ +Eynollah can currently be used to detect the following region types/elements: +* [Border](https://ocr-d.de/en/gt-guidelines/pagexml/pagecontent_xsd_Complex_Type_pc_BorderType.html) +* [Textregion](https://ocr-d.de/en/gt-guidelines/pagexml/pagecontent_xsd_Complex_Type_pc_TextRegionType.html) +* [Textline](https://ocr-d.de/en/gt-guidelines/pagexml/pagecontent_xsd_Complex_Type_pc_TextLineType.html) +* [Image](https://ocr-d.de/en/gt-guidelines/pagexml/pagecontent_xsd_Complex_Type_pc_ImageRegionType.html) +* [Separator](https://ocr-d.de/en/gt-guidelines/pagexml/pagecontent_xsd_Complex_Type_pc_SeparatorRegionType.html) +* [Marginalia](https://ocr-d.de/en/gt-guidelines/trans/lyMarginalie.html) +* [Initial (Drop Capital)](https://ocr-d.de/en/gt-guidelines/trans/lyInitiale.html) + +In addition, the tool can detect the [ReadingOrder](https://ocr-d.de/en/gt-guidelines/trans/lyLeserichtung.html) of regions. The final goal is to feed the output to an OCR model. + +
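Downstream consumers can read these regions and the reading order directly from the PAGE-XML. A minimal sketch with `lxml` follows; the namespace URL is an assumption (the 2019-07-15 PAGE schema), so adjust it to the version Eynollah actually writes.

```python
# Sketch: iterate over TextRegions of an Eynollah PAGE-XML file in reading
# order, e.g. to hand them to an OCR model. The namespace is an assumption.
from lxml import etree

NS = {"pc": "http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15"}

def regions_in_reading_order(page_xml_path):
    tree = etree.parse(page_xml_path)
    # ReadingOrder lists region ids as RegionRefIndexed elements
    order = {
        ref.get("regionRef"): int(ref.get("index"))
        for ref in tree.iterfind(".//pc:RegionRefIndexed", NS)
    }
    regions = tree.findall(".//pc:TextRegion", NS)
    regions.sort(key=lambda r: order.get(r.get("id"), len(order)))
    for region in regions:
        coords = region.find("pc:Coords", NS).get("points")
        yield region.get("id"), coords

for rid, points in regions_in_reading_order("example.xml"):
    print(rid, points)
```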
+ +## Method description + +
+ click to expand/collapse
+
+Eynollah uses a combination of various models and heuristics (see flowchart below for the different stages and how they interact):
+* [Border detection](https://github.com/qurator-spk/eynollah#border-detection)
+* [Layout detection](https://github.com/qurator-spk/eynollah#layout-detection)
+* [Textline detection](https://github.com/qurator-spk/eynollah#textline-detection)
+* [Image enhancement](https://github.com/qurator-spk/eynollah#image-enhancement)
+* [Scale classification](https://github.com/qurator-spk/eynollah#scale-classification)
+* [Heuristic methods](https://github.com/qurator-spk/eynollah#heuristic-methods)
+
+The first three stages are based on [pixel-wise segmentation](https://github.com/qurator-spk/sbb_pixelwise_segmentation).
+
+![](https://user-images.githubusercontent.com/952378/100619946-1936f680-331e-11eb-9297-6e8b4cab3c16.png)
+
+### Border detection
+For the purpose of text recognition (OCR) and in order to avoid noise being introduced from texts outside the printspace, one first needs to detect the border of the printed frame. This is done by a binary pixel-wise segmentation model trained on a dataset of 2,000 documents, of which about 1,200 come from the [dhSegment](https://github.com/dhlab-epfl/dhSegment/) project (you can download the dataset from [here](https://github.com/dhlab-epfl/dhSegment/releases/download/v0.2/pages.zip)) and the remainder were annotated at SBB. For border detection, the model needs to be fed the whole image at once rather than separate patches.
+
+### Layout detection
+As a next step, text regions need to be identified by means of layout detection. Again, a pixel-wise segmentation model was trained on 131 labeled images from the SBB digital collections, including some data augmentation. Since this tool targets historical documents, we consider as main region types text regions, separators, images, tables and background - each with their own subclasses, e.g. in the case of text regions, subclasses like header/heading, drop capital, main body text etc. While it would be desirable to detect and classify each of these classes in a granular way, there are also limitations due to having a suitably large and balanced training set. Accordingly, the current version of this tool is focussed on the main region types background, text region, image and separator.
+
+### Textline detection
+In a subsequent step, binary pixel-wise segmentation is used again to classify the pixels in a document that constitute textlines. For textline segmentation, a model was initially trained on documents with only one column/block of text and some augmentation with regard to scaling. By fine-tuning the parameters also for multi-column documents, additional training data was produced that resulted in a much more robust textline detection model.
+
+### Image enhancement
+This is an image-to-image model whose input is a low-quality image and whose label is the original image. Since no ground truth was available for this task, we artificially degraded the quality of documents from SBB and fed these into the model.
+
+### Scale classification
+This is simply an image classifier that classifies images based on their scale or, more precisely, their number of columns.
+
+### Heuristic methods
+Some heuristic methods are also employed to further improve the model predictions:
+* After border detection, the largest contour is determined by a bounding box, and the image is cropped to these coordinates.
+* For text region detection, the image is scaled up to make it easier for the model to detect background space between text regions. +* A minimum area is defined for text regions in relation to the overall image dimensions, so that very small regions that are noise can be filtered out. +* Deskewing is applied on the text region level (due to regions having different degrees of skew) in order to improve the textline segmentation result. +* After deskewing, a calculation of the pixel distribution on the X-axis allows the separation of textlines (foreground) and background pixels. +* Finally, using the derived coordinates, bounding boxes are determined for each textline. + +
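The deskew-then-project step in this list can be illustrated with a simplified sketch. This is not Eynollah's exact implementation (which also handles per-region slopes and curved lines), but it shows the core idea: after deskewing, textlines run horizontally, so counting foreground pixels per row yields a profile whose valleys separate consecutive lines. Eynollah itself relies on `scipy.signal.find_peaks` for this kind of analysis.

```python
# Sketch: projection-profile separation of textlines in a deskewed region.
import numpy as np
from scipy.signal import find_peaks

def split_textlines(region_mask):
    """region_mask: 2D array, nonzero where textline pixels were predicted."""
    profile = (region_mask > 0).sum(axis=1).astype(float)
    # smooth the profile a little to suppress speckle noise
    kernel = np.ones(5) / 5.0
    profile = np.convolve(profile, kernel, mode="same")
    # valleys of the profile (= peaks of its negation) are line boundaries
    valleys, _ = find_peaks(-profile, distance=10)
    bounds = [0, *valleys.tolist(), region_mask.shape[0]]
    # return (top, bottom) row intervals, skipping degenerate slivers
    return [(top, bottom) for top, bottom in zip(bounds, bounds[1:]) if bottom - top > 3]
```

Bounding boxes per textline then follow from these row intervals together with the horizontal extent of the foreground pixels in each interval.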
+ +## Model description + +
+ click to expand/collapse
+ +TODO + +
+ +## How to use + +
+ click to expand/collapse
+
+First, this model makes use of up to 9 trained models which are responsible for different operations like size detection, column classification, image enhancement, page extraction, main layout detection, full layout detection and textline detection. That does not mean that all 9 models are always required for every document. Based on the document characteristics and the parameters specified, different scenarios can be applied; a scripted batch example follows after this section.
+
+* If none of the parameters is set to `true`, the tool will perform a layout detection of main regions (background, text, images, separators and marginals). An advantage of this tool is that it tries to extract the main text regions separately as much as possible.
+
+* If you set the `-ae` (**a**llow image **e**nhancement) parameter to `true`, the tool will first check the ppi (pixels per inch) of the image, and if it is less than 300, the tool will resize it and only then will image enhancement occur. Image enhancement can also take place without this option, but by setting this option to `true`, the layout XML data (e.g. coordinates) will be based on the resized and enhanced image instead of the original image.
+
+* For some documents, while the quality is good, their scale is very large, and the performance of the tool decreases. In such cases you can set `-as` (**a**llow **s**caling) to `true`. With this option enabled, the tool will try to rescale the image and only then will the layout detection process begin.
+
+* If you care about drop capitals (initials) and headings, you can set `-fl` (**f**ull **l**ayout) to `true`. With this setting, the tool can currently distinguish 7 document layout classes/elements.
+
+* In cases where the document includes curved headers or curved lines, rectangular bounding boxes for textlines will not be a great option. In such cases it is strongly recommended to set the flag `-cl` (**c**urved **l**ines) to `true` to find the contours of curved lines instead of rectangular bounding boxes. Be advised that enabling this option increases the processing time of the tool.
+
+* To crop and save image regions inside the document, set the parameter `-si` (**s**ave **i**mages) to `true` and provide a directory path to store the extracted images.
+
+* This tool is actively being developed. If problems occur, or the performance does not meet your expectations, we welcome your feedback via [issues](https://github.com/qurator-spk/eynollah/issues).
+
+### `--full-layout` vs `--no-full-layout`
+
+Here are the differences in elements detected depending on the `--full-layout`/`--no-full-layout` command line flags:
+
+| | `--full-layout` | `--no-full-layout` |
+| --- | --- | --- |
+| reading order | x | x |
+| header regions | x | - |
+| text regions | x | x |
+| text regions / text line | x | x |
+| drop-capitals | x | - |
+| marginals | x | x |
+| marginals / text line | x | x |
+| image region | x | x |
+
+### Use as OCR-D processor
+
+Eynollah ships with a CLI interface to be used as [OCR-D](https://ocr-d.de) processor. In this case, the source image file group with (preferably) RGB images should be used as input (the image provided by `@imageFilename` is passed on directly):
+
+`ocrd-eynollah-segment -I OCR-D-IMG -O SEG-LINE -P models`
+
+ ### Eynollah "light"
+
+ TODO
+
+ +
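For batch processing, the command-line interface can be scripted. The sketch below assumes the short options `-i` (input image), `-o` (output directory) and `-m` (models directory); option spellings may differ between versions, so check `eynollah --help` before using it, and note that the paths are placeholders.

```python
# Sketch: run the eynollah CLI over a folder of scans.
import subprocess
from pathlib import Path

MODELS = "models_eynollah"   # as downloaded by `make models`
OUT = Path("pagexml-output")
OUT.mkdir(exist_ok=True)

for image in sorted(Path("scans").glob("*.tif")):
    subprocess.run(
        ["eynollah",
         "-i", str(image),
         "-o", str(OUT),
         "-m", MODELS,
         "-fl"],             # full layout: also detect headings/drop capitals;
                             # boolean options may be paired flags (e.g. -fl/-nfl)
        check=True,
    )
```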
From aa64a54feb45de5086b1cd888ee1e15089631b73 Mon Sep 17 00:00:00 2001 From: Clemens Neudecker <952378+cneud@users.noreply.github.com> Date: Tue, 29 Mar 2022 17:19:18 +0200 Subject: [PATCH 03/34] markdown --- README.md | 22 +++++++++++----------- 1 file changed, 11 insertions(+), 11 deletions(-) diff --git a/README.md b/README.md index a35f972..deedcf7 100644 --- a/README.md +++ b/README.md @@ -55,7 +55,7 @@ The tool performs better with RGB images than greyscale/binarized images.
click to expand/collapse -## Region types +### Region types
click to expand/collapse
@@ -73,7 +73,7 @@ In addition, the tool can detect the [ReadingOrder](https://ocr-d.de/en/gt-guide
-## Method description +### Method description
click to expand/collapse
@@ -90,19 +90,19 @@ The first three stages are based on [pixel-wise segmentation](https://github.com ![](https://user-images.githubusercontent.com/952378/100619946-1936f680-331e-11eb-9297-6e8b4cab3c16.png) -### Border detection +#### Border detection For the purpose of text recognition (OCR) and in order to avoid noise being introduced from texts outside the printspace, one first needs to detect the border of the printed frame. This is done by a binary pixel-wise-segmentation model trained on a dataset of 2,000 documents where about 1,200 of them come from the [dhSegment](https://github.com/dhlab-epfl/dhSegment/) project (you can download the dataset from [here](https://github.com/dhlab-epfl/dhSegment/releases/download/v0.2/pages.zip)) and the remainder having been annotated in SBB. For border detection, the model needs to be fed with the whole image at once rather than separated in patches. ### Layout detection As a next step, text regions need to be identified by means of layout detection. Again a pixel-wise segmentation model was trained on 131 labeled images from the SBB digital collections, including some data augmentation. Since the target of this tool are historical documents, we consider as main region types text regions, separators, images, tables and background - each with their own subclasses, e.g. in the case of text regions, subclasses like header/heading, drop capital, main body text etc. While it would be desirable to detect and classify each of these classes in a granular way, there are also limitations due to having a suitably large and balanced training set. Accordingly, the current version of this tool is focussed on the main region types background, text region, image and separator. -### Textline detection +#### Textline detection In a subsequent step, binary pixel-wise segmentation is used again to classify pixels in a document that constitute textlines. For textline segmentation, a model was initially trained on documents with only one column/block of text and some augmentation with regard to scaling. By fine-tuning the parameters also for multi-column documents, additional training data was produced that resulted in a much more robust textline detection model. -### Image enhancement +#### Image enhancement This is an image to image model which input was low quality of an image and label was actually the original image. For this one we did not have any GT, so we decreased the quality of documents in SBB and then feed them into model. -### Scale classification +#### Scale classification This is simply an image classifier which classifies images based on their scales or better to say based on their number of columns. ### Heuristic methods @@ -116,7 +116,7 @@ Some heuristic methods are also employed to further improve the model prediction
-## Model description +### Model description
click to expand/collapse
@@ -125,7 +125,7 @@ TODO
-## How to use +### How to use
click to expand/collapse
@@ -146,7 +146,7 @@ First, this model makes use of up to 9 trained models which are responsible for * This tool is actively being developed. If problems occur, or the performance does not meet your expectations, we welcome your feedback via [issues](https://github.com/qurator-spk/eynollah/issues). -### `--full-layout` vs `--no-full-layout` +#### `--full-layout` vs `--no-full-layout` Here are the difference in elements detected depending on the `--full-layout`/`--no-full-layout` command line flags: @@ -161,13 +161,13 @@ Here are the difference in elements detected depending on the `--full-layout`/`- | marginals / text line | x | x | | image region | x | x | -### Use as OCR-D processor +#### Use as OCR-D processor Eynollah ships with a CLI interface to be used as [OCR-D](https://ocr-d.de) processor. In this case, the source image file group with (preferably) RGB images should be used as input (the image provided by `@imageFilename` is passed on directly): `ocrd-eynollah-segment -I OCR-D-IMG -O SEG-LINE -P models` - ### Eynollah "light" + #### Eynollah "light" TODO From 441c8566dda5cc2b37fd92a39236dc595a547298 Mon Sep 17 00:00:00 2001 From: Clemens Neudecker <952378+cneud@users.noreply.github.com> Date: Wed, 30 Mar 2022 17:05:04 +0200 Subject: [PATCH 04/34] additional details on OCR-D usage --- README.md | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index deedcf7..b20289d 100644 --- a/README.md +++ b/README.md @@ -163,9 +163,15 @@ Here are the difference in elements detected depending on the `--full-layout`/`- #### Use as OCR-D processor -Eynollah ships with a CLI interface to be used as [OCR-D](https://ocr-d.de) processor. In this case, the source image file group with (preferably) RGB images should be used as input (the image provided by `@imageFilename` is passed on directly): +Eynollah ships with a CLI interface to be used as [OCR-D](https://ocr-d.de) processor. In this case, the source image file group with (preferably) RGB images should be used as input like this: `ocrd-eynollah-segment -I OCR-D-IMG -O SEG-LINE -P models` + +In fact, the image referenced by `@imageFilename` in PAGE-XML is passed on directly to Eynollah as a processor, so that e.g. 
calling
+
+`ocrd-eynollah-segment -I OCR-D-IMG-BIN -O SEG-LINE -P models`
+
+would still use the original (RGB) image despite any binarization that may have occurred in previous OCR-D processing steps.

 #### Eynollah "light"

From d19170035d36e9f0478dd9dfeb781bd55e017171 Mon Sep 17 00:00:00 2001
From: vahidrezanezhad
Date: Mon, 4 Apr 2022 22:21:55 -0400
Subject: [PATCH 05/34] updating model directory

---
 Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index 920f15b..39c8a9b 100644
--- a/Makefile
+++ b/Makefile
@@ -25,7 +25,7 @@ models_eynollah: models_eynollah.tar.gz
 	tar xf models_eynollah.tar.gz
 
 models_eynollah.tar.gz:
-	wget 'https://qurator-data.de/eynollah/models_eynollah.tar.gz'
+	wget 'https://qurator-data.de/eynollah/2021-04-25/models_eynollah.tar.gz'
 
 # Install with pip
 install:

From adf10942fa8516bbab5bd01649944f9613c24c96 Mon Sep 17 00:00:00 2001
From: vahid
Date: Tue, 5 Apr 2022 07:47:55 -0400
Subject: [PATCH 06/34] issue #55 resolved

---
 qurator/eynollah/eynollah.py | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/qurator/eynollah/eynollah.py b/qurator/eynollah/eynollah.py
index 81c0b0c..d3253a6 100644
--- a/qurator/eynollah/eynollah.py
+++ b/qurator/eynollah/eynollah.py
@@ -546,7 +546,7 @@ class Eynollah:
         if img.shape[1] < img_width_model:
             img = resize_image(img, img.shape[0], img_width_model)
 
-        self.logger.info("Image dimensions: %sx%s", img_height_model, img_width_model)
+        self.logger.info("Patch size: %sx%s", img_height_model, img_width_model)
         margin = int(marginal_of_patch_percent * img_height_model)
         width_mid = img_width_model - 2 * margin
         height_mid = img_height_model - 2 * margin
@@ -2316,7 +2316,7 @@ class Eynollah:
         num_col, num_col_classifier, img_only_regions, page_coord, image_page, mask_images, mask_lines, text_regions_p_1, cont_page, table_prediction = \
             self.run_graphics_and_columns(text_regions_p_1, num_col_classifier, num_column_is_classified, erosion_hurts)
         self.logger.info("Graphics detection took %.1fs ", time.time() - t1)
-        self.logger.info('cont_page %s', cont_page)
+        #self.logger.info('cont_page %s', cont_page)
 
         if not num_col:
             self.logger.info("No columns detected, outputting an empty PAGE-XML")
@@ -2355,7 +2355,7 @@ class Eynollah:
         if len(contours_only_text_parent) > 0:
             areas_cnt_text = np.array([cv2.contourArea(contours_only_text_parent[j]) for j in range(len(contours_only_text_parent))])
             areas_cnt_text = areas_cnt_text / float(text_only.shape[0] * text_only.shape[1])
-            self.logger.info('areas_cnt_text %s', areas_cnt_text)
+            #self.logger.info('areas_cnt_text %s', areas_cnt_text)
             contours_biggest = contours_only_text_parent[np.argmax(areas_cnt_text)]
             contours_only_text_parent = [contours_only_text_parent[jz] for jz in range(len(contours_only_text_parent)) if areas_cnt_text[jz] > min_con_area]
             areas_cnt_text_parent = [areas_cnt_text[jz] for jz in range(len(areas_cnt_text)) if areas_cnt_text[jz] > min_con_area]
@@ -2445,7 +2445,7 @@ class Eynollah:
             cx_bigest_big, cy_biggest_big, _, _, _, _, _ = find_new_features_of_contours([contours_biggest])
             cx_bigest, cy_biggest, _, _, _, _, _ = find_new_features_of_contours(contours_only_text_parent)
-            self.logger.debug('areas_cnt_text_parent %s', areas_cnt_text_parent)
+            #self.logger.debug('areas_cnt_text_parent %s', areas_cnt_text_parent)
             # self.logger.debug('areas_cnt_text_parent_d %s', areas_cnt_text_parent_d)
             # self.logger.debug('len(contours_only_text_parent) %s', len(contours_only_text_parent_d))
         else:

From
f27ac155ae1362625f1d5fce9ffcb354ac6c4c30 Mon Sep 17 00:00:00 2001 From: "Gerber, Mike" Date: Wed, 6 Apr 2022 14:47:29 +0200 Subject: [PATCH 07/34] =?UTF-8?q?=F0=9F=A7=B9=20Downgrade=20"Patch=20size"?= =?UTF-8?q?=20log=20message=20to=20debug?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Fixes gh-55. --- qurator/eynollah/eynollah.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/qurator/eynollah/eynollah.py b/qurator/eynollah/eynollah.py index d3253a6..f8d7d09 100644 --- a/qurator/eynollah/eynollah.py +++ b/qurator/eynollah/eynollah.py @@ -546,7 +546,7 @@ class Eynollah: if img.shape[1] < img_width_model: img = resize_image(img, img.shape[0], img_width_model) - self.logger.info("Patch size: %sx%s", img_height_model, img_width_model) + self.logger.debug("Patch size: %sx%s", img_height_model, img_width_model) margin = int(marginal_of_patch_percent * img_height_model) width_mid = img_width_model - 2 * margin height_mid = img_height_model - 2 * margin From a33a1995cb880d190b62cd9593dac3dd43d33deb Mon Sep 17 00:00:00 2001 From: Clemens Neudecker <952378+cneud@users.noreply.github.com> Date: Fri, 8 Apr 2022 16:31:20 +0200 Subject: [PATCH 08/34] Update README.md --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index b20289d..9cb5d60 100644 --- a/README.md +++ b/README.md @@ -121,7 +121,7 @@ Some heuristic methods are also employed to further improve the model prediction
click to expand/collapse
-TODO +Coming soon
From 568391ec4ad0af096f5f2168d448331d5b9622f2 Mon Sep 17 00:00:00 2001 From: Clemens Neudecker <952378+cneud@users.noreply.github.com> Date: Tue, 26 Apr 2022 10:54:20 +0200 Subject: [PATCH 09/34] require model command line option (fix #59) (#73) --- qurator/eynollah/cli.py | 1 + 1 file changed, 1 insertion(+) diff --git a/qurator/eynollah/cli.py b/qurator/eynollah/cli.py index f343918..e419411 100644 --- a/qurator/eynollah/cli.py +++ b/qurator/eynollah/cli.py @@ -24,6 +24,7 @@ from qurator.eynollah.eynollah import Eynollah "-m", help="directory of models", type=click.Path(exists=True, file_okay=False), + required=True, ) @click.option( "--save_images", From ecf117ca9596e570dfb5155abbd8966b1e82a486 Mon Sep 17 00:00:00 2001 From: cneud Date: Tue, 26 Apr 2022 11:50:20 +0200 Subject: [PATCH 10/34] adapt to tf1.compat session mode in tf2 --- qurator/eynollah/eynollah.py | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/qurator/eynollah/eynollah.py b/qurator/eynollah/eynollah.py index f8d7d09..784f07f 100644 --- a/qurator/eynollah/eynollah.py +++ b/qurator/eynollah/eynollah.py @@ -20,10 +20,10 @@ import numpy as np os.environ["TF_CPP_MIN_LOG_LEVEL"] = "3" stderr = sys.stderr sys.stderr = open(os.devnull, "w") -from keras import backend as K -from keras.models import load_model -sys.stderr = stderr import tensorflow as tf +from tensorflow.python.keras import backend as K +from tensorflow.keras.models load_model +sys.stderr = stderr tf.get_logger().setLevel("ERROR") warnings.filterwarnings("ignore") from scipy.signal import find_peaks From 8c11b2253dfad142aa896defc70813f222a713b7 Mon Sep 17 00:00:00 2001 From: cneud Date: Tue, 26 Apr 2022 11:51:22 +0200 Subject: [PATCH 11/34] update requirements (use tf2) --- requirements.txt | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/requirements.txt b/requirements.txt index 8520780..6f2ea48 100644 --- a/requirements.txt +++ b/requirements.txt @@ -1,8 +1,7 @@ # ocrd includes opencv, numpy, shapely, click ocrd >= 2.23.3 -keras >= 2.3.1, < 2.4 scikit-learn >= 0.23.2 -tensorflow-gpu >= 1.15, < 2 +tensorflow-gpu >= 2.4.0 imutils >= 0.5.3 matplotlib setuptools >= 50 From 934bbd589267d08ff1386315868b752a631cbd42 Mon Sep 17 00:00:00 2001 From: cneud Date: Tue, 26 Apr 2022 12:04:27 +0200 Subject: [PATCH 12/34] cleanup --- qurator/eynollah/eynollah.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/qurator/eynollah/eynollah.py b/qurator/eynollah/eynollah.py index 784f07f..ff3ceac 100644 --- a/qurator/eynollah/eynollah.py +++ b/qurator/eynollah/eynollah.py @@ -22,7 +22,7 @@ stderr = sys.stderr sys.stderr = open(os.devnull, "w") import tensorflow as tf from tensorflow.python.keras import backend as K -from tensorflow.keras.models load_model +from tensorflow.keras.models import load_model sys.stderr = stderr tf.get_logger().setLevel("ERROR") warnings.filterwarnings("ignore") From 34a061782c0c5bcb193e3621933a4c3020b6718a Mon Sep 17 00:00:00 2001 From: Robert Sachunsky <38561704+bertsky@users.noreply.github.com> Date: Tue, 3 May 2022 23:19:01 +0200 Subject: [PATCH 13/34] depend on tensorflow instead of tensorflow-gpu (#76) --- requirements.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/requirements.txt b/requirements.txt index 6f2ea48..0180d01 100644 --- a/requirements.txt +++ b/requirements.txt @@ -1,7 +1,7 @@ # ocrd includes opencv, numpy, shapely, click ocrd >= 2.23.3 scikit-learn >= 0.23.2 -tensorflow-gpu >= 2.4.0 +tensorflow >= 2.4.0 imutils >= 0.5.3 matplotlib setuptools >= 50 From 
00be99d29b735700ab5f51359d7e8f47e0cdee53 Mon Sep 17 00:00:00 2001 From: Clemens Neudecker <952378+cneud@users.noreply.github.com> Date: Tue, 17 May 2022 12:01:25 +0200 Subject: [PATCH 14/34] add short section on supported Python, TF and CUDA versions --- README.md | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index 9cb5d60..766946a 100644 --- a/README.md +++ b/README.md @@ -12,7 +12,11 @@ Alternatively, you can also use `make` with these targets: `make install` or -`make install-dev` for editable installation +`make install-dev` for editable installation + +The current version of Eynollah runs on Python `>=3.6` with Tensorflow `>=2.4`. + +In order to use a GPU for inference, the CUDA toolkit version 10.x needs to be installed. ### Models From 8d5079c909b662eda0b4acf5ae2908455f0ff939 Mon Sep 17 00:00:00 2001 From: vahid Date: Fri, 22 Jul 2022 15:43:19 +0200 Subject: [PATCH 15/34] issue #77 is resolved on main branch --- qurator/eynollah/utils/__init__.py | 19 ++++++++++++++----- 1 file changed, 14 insertions(+), 5 deletions(-) diff --git a/qurator/eynollah/utils/__init__.py b/qurator/eynollah/utils/__init__.py index 2533455..128c5d3 100644 --- a/qurator/eynollah/utils/__init__.py +++ b/qurator/eynollah/utils/__init__.py @@ -294,7 +294,7 @@ def return_x_start_end_mothers_childs_and_type_of_reading_order(x_min_hor_some,x #print(args_to_be_unified,'args_to_be_unified') - return reading_orther_type,x_start_returned, x_end_returned ,y_sep_returned,y_diff_returned,y_lines_without_mother,x_start_without_mother,x_end_without_mother,there_is_sep_with_child,y_lines_with_child_without_mother,x_start_with_child_without_mother,x_end_with_child_without_mother + return reading_orther_type,x_start_returned, x_end_returned ,y_sep_returned,y_diff_returned,y_lines_without_mother,x_start_without_mother,x_end_without_mother,there_is_sep_with_child,y_lines_with_child_without_mother,x_start_with_child_without_mother,x_end_with_child_without_mother,new_main_sep_y def crop_image_inside_box(box, img_org_copy): image_box = img_org_copy[box[1] : box[1] + box[3], box[0] : box[0] + box[2]] return image_box, [box[1], box[1] + box[3], box[0], box[0] + box[2]] @@ -1695,7 +1695,7 @@ def return_boxes_of_images_by_order_of_reading_new(splitter_y_new, regions_witho peaks_neg_tot_tables.append(peaks_neg_tot) - reading_order_type,x_starting,x_ending,y_type_2,y_diff_type_2,y_lines_without_mother,x_start_without_mother,x_end_without_mother,there_is_sep_with_child,y_lines_with_child_without_mother,x_start_with_child_without_mother,x_end_with_child_without_mother=return_x_start_end_mothers_childs_and_type_of_reading_order(x_min_hor_some,x_max_hor_some,cy_hor_some,peaks_neg_tot,cy_hor_diff) + reading_order_type,x_starting,x_ending,y_type_2,y_diff_type_2,y_lines_without_mother,x_start_without_mother,x_end_without_mother,there_is_sep_with_child,y_lines_with_child_without_mother,x_start_with_child_without_mother,x_end_with_child_without_mother,new_main_sep_y=return_x_start_end_mothers_childs_and_type_of_reading_order(x_min_hor_some,x_max_hor_some,cy_hor_some,peaks_neg_tot,cy_hor_diff) @@ -2164,9 +2164,18 @@ def return_boxes_of_images_by_order_of_reading_new(splitter_y_new, regions_witho ##y_lines_by_order.append(int(splitter_y_new[i])) ##x_start_by_order.append(0) - y_type_2.append(int(splitter_y_new[i])) - x_starting.append(x_starting[0]) - x_ending.append(x_ending[0]) + #y_type_2.append(int(splitter_y_new[i])) + #x_starting.append(x_starting[0]) + #x_ending.append(x_ending[0]) + + if 
len(new_main_sep_y)>0: + y_type_2.append(int(splitter_y_new[i])) + x_starting.append(0) + x_ending.append(len(peaks_neg_tot)-1) + else: + y_type_2.append(int(splitter_y_new[i])) + x_starting.append(x_starting[0]) + x_ending.append(x_ending[0]) y_type_2=np.array(y_type_2) From 98529d63250f81a780b960b34e6f59da937a3959 Mon Sep 17 00:00:00 2001 From: emresvd <109899306+emresvd@users.noreply.github.com> Date: Tue, 7 Feb 2023 13:36:16 +0300 Subject: [PATCH 16/34] Update README.md --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 766946a..82c93d7 100644 --- a/README.md +++ b/README.md @@ -6,7 +6,7 @@ ## Installation `pip install .` or -`pip install . -e` for editable installation +`pip install -e .` for editable installation Alternatively, you can also use `make` with these targets: From ac69136e8f9e5d8117279d3acfe4829b89ce69d2 Mon Sep 17 00:00:00 2001 From: Clemens Neudecker <952378+cneud@users.noreply.github.com> Date: Thu, 9 Feb 2023 20:24:19 +0100 Subject: [PATCH 17/34] Update config.yml (#89) * Update config.yml * Update config.yml --- .circleci/config.yml | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/.circleci/config.yml b/.circleci/config.yml index 72b2c5a..1a11d46 100644 --- a/.circleci/config.yml +++ b/.circleci/config.yml @@ -2,9 +2,9 @@ version: 2 jobs: - build-python36: + build-python37: docker: - - image: python:3.6 + - image: python:3.7 steps: - checkout - restore_cache: @@ -23,6 +23,6 @@ workflows: version: 2 build: jobs: - - build-python36 - #- build-python37 - #- build-python38 # no tensorflow for python 3.8 + #- build-python36 + - build-python37 + #- build-python38 From 79e897d3b2877d6002448b1e9d75e331636b078b Mon Sep 17 00:00:00 2001 From: Robert Sachunsky Date: Fri, 10 Feb 2023 00:56:52 +0000 Subject: [PATCH 18/34] try loading as TF SavedModel instead of HDF5 --- qurator/eynollah/eynollah.py | 3 +++ 1 file changed, 3 insertions(+) diff --git a/qurator/eynollah/eynollah.py b/qurator/eynollah/eynollah.py index ff3ceac..d6f70c3 100644 --- a/qurator/eynollah/eynollah.py +++ b/qurator/eynollah/eynollah.py @@ -515,6 +515,9 @@ class Eynollah: gpu_options = tf.compat.v1.GPUOptions(allow_growth=True) #gpu_options = tf.compat.v1.GPUOptions(per_process_gpu_memory_fraction=7.7, allow_growth=True) session = tf.compat.v1.Session(config=tf.compat.v1.ConfigProto(gpu_options=gpu_options)) + if model_dir.endswith('.h5') and Path(model_dir[:-3]).exists(): + # prefer SavedModel over HDF5 format if it exists + model_dir = model_dir[:-3] model = load_model(model_dir, compile=False) return model, session From ab4bb7cd7b76d4d04d3dd2273b398153058e7cc4 Mon Sep 17 00:00:00 2001 From: Robert Sachunsky Date: Sat, 11 Feb 2023 11:58:40 +0000 Subject: [PATCH 19/34] silentium! 
--- qurator/eynollah/eynollah.py | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-) diff --git a/qurator/eynollah/eynollah.py b/qurator/eynollah/eynollah.py index d6f70c3..22c45ad 100644 --- a/qurator/eynollah/eynollah.py +++ b/qurator/eynollah/eynollah.py @@ -220,7 +220,8 @@ class Eynollah: index_y_d = img_h - img_height_model img_patch = img[index_y_d:index_y_u, index_x_d:index_x_u, :] - label_p_pred = model_enhancement.predict(img_patch.reshape(1, img_patch.shape[0], img_patch.shape[1], img_patch.shape[2])) + label_p_pred = model_enhancement.predict(img_patch.reshape(1, img_patch.shape[0], img_patch.shape[1], img_patch.shape[2]), + verbose=0) seg = label_p_pred[0, :, :, :] seg = seg * 255 @@ -355,7 +356,7 @@ class Eynollah: img_in[0, :, :, 1] = img_1ch[:, :] img_in[0, :, :, 2] = img_1ch[:, :] - label_p_pred = model_num_classifier.predict(img_in) + label_p_pred = model_num_classifier.predict(img_in, verbose=0) num_col = np.argmax(label_p_pred[0]) + 1 self.logger.info("Found %s columns (%s)", num_col, label_p_pred) @@ -428,7 +429,7 @@ class Eynollah: - label_p_pred = model_num_classifier.predict(img_in) + label_p_pred = model_num_classifier.predict(img_in, verbose=0) num_col = np.argmax(label_p_pred[0]) + 1 self.logger.info("Found %s columns (%s)", num_col, label_p_pred) session_col_classifier.close() @@ -534,7 +535,8 @@ class Eynollah: img = img / float(255.0) img = resize_image(img, img_height_model, img_width_model) - label_p_pred = model.predict(img.reshape(1, img.shape[0], img.shape[1], img.shape[2])) + label_p_pred = model.predict(img.reshape(1, img.shape[0], img.shape[1], img.shape[2]), + verbose=0) seg = np.argmax(label_p_pred, axis=3)[0] seg_color = np.repeat(seg[:, :, np.newaxis], 3, axis=2) @@ -586,7 +588,8 @@ class Eynollah: index_y_d = img_h - img_height_model img_patch = img[index_y_d:index_y_u, index_x_d:index_x_u, :] - label_p_pred = model.predict(img_patch.reshape(1, img_patch.shape[0], img_patch.shape[1], img_patch.shape[2])) + label_p_pred = model.predict(img_patch.reshape(1, img_patch.shape[0], img_patch.shape[1], img_patch.shape[2]), + verbose=0) seg = np.argmax(label_p_pred, axis=3)[0] seg_color = np.repeat(seg[:, :, np.newaxis], 3, axis=2) From 2d9ccac35416afe24553147c429035c94cc2bf24 Mon Sep 17 00:00:00 2001 From: Robert Sachunsky Date: Sat, 11 Feb 2023 12:04:16 +0000 Subject: [PATCH 20/34] contours: simplify --- qurator/eynollah/eynollah.py | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/qurator/eynollah/eynollah.py b/qurator/eynollah/eynollah.py index 22c45ad..301f750 100644 --- a/qurator/eynollah/eynollah.py +++ b/qurator/eynollah/eynollah.py @@ -2359,12 +2359,12 @@ class Eynollah: contours_only_text_parent = return_parent_contours(contours_only_text, hir_on_text) if len(contours_only_text_parent) > 0: - areas_cnt_text = np.array([cv2.contourArea(contours_only_text_parent[j]) for j in range(len(contours_only_text_parent))]) + areas_cnt_text = np.array([cv2.contourArea(c) for c in contours_only_text_parent]) areas_cnt_text = areas_cnt_text / float(text_only.shape[0] * text_only.shape[1]) #self.logger.info('areas_cnt_text %s', areas_cnt_text) contours_biggest = contours_only_text_parent[np.argmax(areas_cnt_text)] - contours_only_text_parent = [contours_only_text_parent[jz] for jz in range(len(contours_only_text_parent)) if areas_cnt_text[jz] > min_con_area] - areas_cnt_text_parent = [areas_cnt_text[jz] for jz in range(len(areas_cnt_text)) if areas_cnt_text[jz] > min_con_area] + contours_only_text_parent = [c 
for jz, c in enumerate(contours_only_text_parent) if areas_cnt_text[jz] > min_con_area] + areas_cnt_text_parent = [area for area in areas_cnt_text if area > min_con_area] index_con_parents = np.argsort(areas_cnt_text_parent) contours_only_text_parent = list(np.array(contours_only_text_parent)[index_con_parents]) @@ -2376,14 +2376,14 @@ class Eynollah: contours_only_text_d, hir_on_text_d = return_contours_of_image(text_only_d) contours_only_text_parent_d = return_parent_contours(contours_only_text_d, hir_on_text_d) - areas_cnt_text_d = np.array([cv2.contourArea(contours_only_text_parent_d[j]) for j in range(len(contours_only_text_parent_d))]) + areas_cnt_text_d = np.array([cv2.contourArea(c) for c in contours_only_text_parent_d]) areas_cnt_text_d = areas_cnt_text_d / float(text_only_d.shape[0] * text_only_d.shape[1]) if len(areas_cnt_text_d)>0: contours_biggest_d = contours_only_text_parent_d[np.argmax(areas_cnt_text_d)] - index_con_parents_d=np.argsort(areas_cnt_text_d) - contours_only_text_parent_d=list(np.array(contours_only_text_parent_d)[index_con_parents_d] ) - areas_cnt_text_d=list(np.array(areas_cnt_text_d)[index_con_parents_d] ) + index_con_parents_d = np.argsort(areas_cnt_text_d) + contours_only_text_parent_d = list(np.array(contours_only_text_parent_d)[index_con_parents_d]) + areas_cnt_text_d = list(np.array(areas_cnt_text_d)[index_con_parents_d]) cx_bigest_d_big, cy_biggest_d_big, _, _, _, _, _ = find_new_features_of_contours([contours_biggest_d]) cx_bigest_d, cy_biggest_d, _, _, _, _, _ = find_new_features_of_contours(contours_only_text_parent_d) @@ -2438,12 +2438,12 @@ class Eynollah: contours_only_text_parent = return_parent_contours(contours_only_text, hir_on_text) if len(contours_only_text_parent) > 0: - areas_cnt_text = np.array([cv2.contourArea(contours_only_text_parent[j]) for j in range(len(contours_only_text_parent))]) + areas_cnt_text = np.array([cv2.contourArea(c) for c in contours_only_text_parent]) areas_cnt_text = areas_cnt_text / float(text_only.shape[0] * text_only.shape[1]) contours_biggest = contours_only_text_parent[np.argmax(areas_cnt_text)] - contours_only_text_parent = [contours_only_text_parent[jz] for jz in range(len(contours_only_text_parent)) if areas_cnt_text[jz] > min_con_area] - areas_cnt_text_parent = [areas_cnt_text[jz] for jz in range(len(areas_cnt_text)) if areas_cnt_text[jz] > min_con_area] + contours_only_text_parent = [c for jz, c in enumerate(contours_only_text_parent) if areas_cnt_text[jz] > min_con_area] + areas_cnt_text_parent = [area for area in areas_cnt_text if area > min_con_area] index_con_parents = np.argsort(areas_cnt_text_parent) contours_only_text_parent = list(np.array(contours_only_text_parent)[index_con_parents]) From 13bc2378d952f1ef7637480304d5383a45af789d Mon Sep 17 00:00:00 2001 From: Clemens Neudecker <952378+cneud@users.noreply.github.com> Date: Thu, 16 Feb 2023 15:33:44 +0100 Subject: [PATCH 21/34] Update config.yml (#90) * Update config.yml enable CI for Python 3.8 * Update test-eynollah.yml Use 3.7 for actions --- .circleci/config.yml | 19 ++++++++++++++++++- .github/workflows/test-eynollah.yml | 2 +- 2 files changed, 19 insertions(+), 2 deletions(-) diff --git a/.circleci/config.yml b/.circleci/config.yml index 1a11d46..23eb724 100644 --- a/.circleci/config.yml +++ b/.circleci/config.yml @@ -19,10 +19,27 @@ jobs: - run: make install - run: make smoke-test + build-python38: + docker: + - image: python:3.8 + steps: + - checkout + - restore_cache: + keys: + - model-cache + - run: make models + - save_cache: + key: 
model-cache + paths: + models_eynollah.tar.gz + models_eynollah + - run: make install + - run: make smoke-test + workflows: version: 2 build: jobs: #- build-python36 - build-python37 - #- build-python38 + - build-python38 diff --git a/.github/workflows/test-eynollah.yml b/.github/workflows/test-eynollah.yml index 1afd2a6..9e8d7b1 100644 --- a/.github/workflows/test-eynollah.yml +++ b/.github/workflows/test-eynollah.yml @@ -11,7 +11,7 @@ jobs: runs-on: ubuntu-latest strategy: matrix: - python-version: ['3.6'] # '3.7' + python-version: ['3.7'] # '3.8' steps: - uses: actions/checkout@v2 From a56988a35a528aba7cefd85e1a83257f598a5085 Mon Sep 17 00:00:00 2001 From: Robert Sachunsky Date: Sat, 11 Feb 2023 12:05:49 +0000 Subject: [PATCH 22/34] contours: numpy now needs dtype=object --- qurator/eynollah/eynollah.py | 10 +++++----- qurator/eynollah/utils/contour.py | 2 +- 2 files changed, 6 insertions(+), 6 deletions(-) diff --git a/qurator/eynollah/eynollah.py b/qurator/eynollah/eynollah.py index 301f750..c854b46 100644 --- a/qurator/eynollah/eynollah.py +++ b/qurator/eynollah/eynollah.py @@ -2367,7 +2367,7 @@ class Eynollah: areas_cnt_text_parent = [area for area in areas_cnt_text if area > min_con_area] index_con_parents = np.argsort(areas_cnt_text_parent) - contours_only_text_parent = list(np.array(contours_only_text_parent)[index_con_parents]) + contours_only_text_parent = list(np.array(contours_only_text_parent, dtype=object)[index_con_parents]) areas_cnt_text_parent = list(np.array(areas_cnt_text_parent)[index_con_parents]) cx_bigest_big, cy_biggest_big, _, _, _, _, _ = find_new_features_of_contours([contours_biggest]) @@ -2382,7 +2382,7 @@ class Eynollah: if len(areas_cnt_text_d)>0: contours_biggest_d = contours_only_text_parent_d[np.argmax(areas_cnt_text_d)] index_con_parents_d = np.argsort(areas_cnt_text_d) - contours_only_text_parent_d = list(np.array(contours_only_text_parent_d)[index_con_parents_d]) + contours_only_text_parent_d = list(np.array(contours_only_text_parent_d, dtype=object)[index_con_parents_d]) areas_cnt_text_d = list(np.array(areas_cnt_text_d)[index_con_parents_d]) cx_bigest_d_big, cy_biggest_d_big, _, _, _, _, _ = find_new_features_of_contours([contours_biggest_d]) @@ -2446,7 +2446,7 @@ class Eynollah: areas_cnt_text_parent = [area for area in areas_cnt_text if area > min_con_area] index_con_parents = np.argsort(areas_cnt_text_parent) - contours_only_text_parent = list(np.array(contours_only_text_parent)[index_con_parents]) + contours_only_text_parent = list(np.array(contours_only_text_parent, dtype=object)[index_con_parents]) areas_cnt_text_parent = list(np.array(areas_cnt_text_parent)[index_con_parents]) cx_bigest_big, cy_biggest_big, _, _, _, _, _ = find_new_features_of_contours([contours_biggest]) @@ -2473,7 +2473,7 @@ class Eynollah: K.clear_session() if self.full_layout: if np.abs(slope_deskew) >= SLOPE_THRESHOLD: - contours_only_text_parent_d_ordered = list(np.array(contours_only_text_parent_d_ordered)[index_by_text_par_con]) + contours_only_text_parent_d_ordered = list(np.array(contours_only_text_parent_d_ordered, dtype=object)[index_by_text_par_con]) text_regions_p, contours_only_text_parent, contours_only_text_parent_h, all_box_coord, all_box_coord_h, all_found_texline_polygons, all_found_texline_polygons_h, slopes, slopes_h, contours_only_text_parent_d_ordered, contours_only_text_parent_h_d_ordered = check_any_text_region_in_model_one_is_main_or_header(text_regions_p, regions_fully, contours_only_text_parent, all_box_coord, all_found_texline_polygons, slopes, 
contours_only_text_parent_d_ordered) else: contours_only_text_parent_d_ordered = None @@ -2566,7 +2566,7 @@ class Eynollah: if np.abs(slope_deskew) < SLOPE_THRESHOLD: order_text_new, id_of_texts_tot = self.do_order_of_regions(contours_only_text_parent, contours_only_text_parent_h, boxes, textline_mask_tot) else: - contours_only_text_parent_d_ordered = list(np.array(contours_only_text_parent_d_ordered)[index_by_text_par_con]) + contours_only_text_parent_d_ordered = list(np.array(contours_only_text_parent_d_ordered, dtype=object)[index_by_text_par_con]) order_text_new, id_of_texts_tot = self.do_order_of_regions(contours_only_text_parent_d_ordered, contours_only_text_parent_h, boxes_d, textline_mask_tot_d) pcgts = self.writer.build_pagexml_no_full_layout(txt_con_org, page_coord, order_text_new, id_of_texts_tot, all_found_texline_polygons, all_box_coord, polygons_of_images, polygons_of_marginals, all_found_texline_polygons_marginals, all_box_coord_marginals, slopes, slopes_marginals, cont_page, polygons_lines_xml, contours_tables) self.logger.info("Job done in %.1fs", time.time() - t0) diff --git a/qurator/eynollah/utils/contour.py b/qurator/eynollah/utils/contour.py index 6b81391..d8a8af9 100644 --- a/qurator/eynollah/utils/contour.py +++ b/qurator/eynollah/utils/contour.py @@ -19,7 +19,7 @@ def contours_in_same_horizon(cy_main_hor): list_h.append(i) if len(list_h) > 1: all_args.append(list(set(list_h))) - return np.unique(all_args) + return np.unique(np.array(all_args, dtype=object)) def find_contours_mean_y_diff(contours_main): M_main = [cv2.moments(contours_main[j]) for j in range(len(contours_main))] From 7345f6bf678f36cf3a51576b0fa94df0919925d7 Mon Sep 17 00:00:00 2001 From: Robert Sachunsky Date: Sat, 11 Feb 2023 17:11:34 +0000 Subject: [PATCH 23/34] remove TF1 session and GC controls, avoid repeating load_model --- qurator/eynollah/eynollah.py | 154 ++++++----------------------------- 1 file changed, 26 insertions(+), 128 deletions(-) diff --git a/qurator/eynollah/eynollah.py b/qurator/eynollah/eynollah.py index c854b46..170b5a7 100644 --- a/qurator/eynollah/eynollah.py +++ b/qurator/eynollah/eynollah.py @@ -145,6 +145,8 @@ class Eynollah: self.model_region_dir_p_ens = dir_models + "/model_ensemble_s.h5" self.model_textline_dir = dir_models + "/model_textline_newspapers.h5" self.model_tables = dir_models + "/model_tables_ens_mixed_new_2.h5" + + self.models = {} def _cache_images(self, image_filename=None, image_pil=None): ret = {} @@ -255,11 +257,6 @@ class Eynollah: prediction_true[index_y_d + margin : index_y_u - margin, index_x_d + margin : index_x_u - margin, :] = seg prediction_true = prediction_true.astype(int) - session_enhancement.close() - del model_enhancement - del session_enhancement - gc.collect() - return prediction_true def calculate_width_height_by_columns(self, img, num_col, width_early, label_p_pred): @@ -361,16 +358,6 @@ class Eynollah: self.logger.info("Found %s columns (%s)", num_col, label_p_pred) - session_col_classifier.close() - - del model_num_classifier - del session_col_classifier - - K.clear_session() - gc.collect() - - - img_new, _ = self.calculate_width_height_by_columns(img, num_col, width_early, label_p_pred) if img_new.shape[1] > img.shape[1]: @@ -394,11 +381,6 @@ class Eynollah: prediction_bin =np.repeat(prediction_bin[:, :, np.newaxis], 3, axis=2) - session_bin.close() - del model_bin - del session_bin - gc.collect() - prediction_bin = prediction_bin.astype(np.uint8) img= np.copy(prediction_bin) img_bin = np.copy(prediction_bin) @@ -428,12 +410,9 @@ 
class Eynollah: img_in[0, :, :, 2] = img_1ch[:, :] - label_p_pred = model_num_classifier.predict(img_in, verbose=0) num_col = np.argmax(label_p_pred[0]) + 1 - self.logger.info("Found %s columns (%s)", num_col, label_p_pred) - session_col_classifier.close() - K.clear_session() + self.logger.info("Found %d columns (%s)", num_col, np.around(label_p_pred, decimals=5)) if dpi < DPI_THRESHOLD: img_new, num_column_is_classified = self.calculate_width_height_by_columns(img, num_col, width_early, label_p_pred) @@ -444,9 +423,6 @@ class Eynollah: image_res = np.copy(img) is_image_enhanced = False - session_col_classifier.close() - - self.logger.debug("exit resize_and_enhance_image_with_column_classifier") return is_image_enhanced, img, image_res, num_col, num_column_is_classified, img_bin @@ -513,15 +489,24 @@ class Eynollah: def start_new_session_and_model(self, model_dir): self.logger.debug("enter start_new_session_and_model (model_dir=%s)", model_dir) - gpu_options = tf.compat.v1.GPUOptions(allow_growth=True) + #gpu_options = tf.compat.v1.GPUOptions(allow_growth=True) #gpu_options = tf.compat.v1.GPUOptions(per_process_gpu_memory_fraction=7.7, allow_growth=True) - session = tf.compat.v1.Session(config=tf.compat.v1.ConfigProto(gpu_options=gpu_options)) + #session = tf.compat.v1.Session(config=tf.compat.v1.ConfigProto(gpu_options=gpu_options)) + physical_devices = tf.config.list_physical_devices('GPU') + try: + tf.config.experimental.set_memory_growth(physical_devices[0], True) + except: + self.logger.warning("no GPU device available") if model_dir.endswith('.h5') and Path(model_dir[:-3]).exists(): # prefer SavedModel over HDF5 format if it exists model_dir = model_dir[:-3] - model = load_model(model_dir, compile=False) + if model_dir in self.models: + model = self.models[model_dir] + else: + model = load_model(model_dir, compile=False) + self.models[model_dir] = model - return model, session + return model, None def do_prediction(self, patches, img, model, marginal_of_patch_percent=0.1): self.logger.debug("enter do_prediction") @@ -640,8 +625,6 @@ class Eynollah: prediction_true[index_y_d + margin : index_y_u - margin, index_x_d + margin : index_x_u - margin, :] = seg_color prediction_true = prediction_true.astype(np.uint8) - del model - gc.collect() return prediction_true def early_page_for_num_of_column_classification(self,img_bin): @@ -668,19 +651,15 @@ class Eynollah: else: box = [0, 0, img.shape[1], img.shape[0]] croped_page, page_coord = crop_image_inside_box(box, img) - session_page.close() - del model_page - del session_page - gc.collect() - K.clear_session() self.logger.debug("exit early_page_for_num_of_column_classification") return croped_page, page_coord def extract_page(self): self.logger.debug("enter extract_page") cont_page = [] - model_page, session_page = self.start_new_session_and_model(self.model_page_dir) img = cv2.GaussianBlur(self.image, (5, 5), 0) + + model_page, session_page = self.start_new_session_and_model(self.model_page_dir) img_page_prediction = self.do_prediction(False, img, model_page) imgray = cv2.cvtColor(img_page_prediction, cv2.COLOR_BGR2GRAY) _, thresh = cv2.threshold(imgray, 0, 255, 0) @@ -707,11 +686,6 @@ class Eynollah: box = [0, 0, img.shape[1], img.shape[0]] croped_page, page_coord = crop_image_inside_box(box, self.image) cont_page.append(np.array([[page_coord[2], page_coord[0]], [page_coord[3], page_coord[0]], [page_coord[3], page_coord[1]], [page_coord[2], page_coord[1]]])) - session_page.close() - del model_page - del session_page - gc.collect() - 
K.clear_session() self.logger.debug("exit extract_page") return croped_page, page_coord, cont_page @@ -807,11 +781,6 @@ class Eynollah: prediction_regions = self.do_prediction(patches, img, model_region, marginal_of_patch_percent) prediction_regions = resize_image(prediction_regions, img_height_h, img_width_h) - session_region.close() - del model_region - del session_region - gc.collect() - self.logger.debug("exit extract_text_regions") return prediction_regions, prediction_regions2 @@ -1112,9 +1081,6 @@ class Eynollah: prediction_textline_longshot = self.do_prediction(False, img, model_textline) prediction_textline_longshot_true_size = resize_image(prediction_textline_longshot, img_h, img_w) - session_textline.close() - - return prediction_textline[:, :, 0], prediction_textline_longshot_true_size[:, :, 0] def do_work_of_slopes(self, q, poly, box_sub, boxes_per_process, textline_mask_tot, contours_per_process): @@ -1191,11 +1157,6 @@ class Eynollah: ##plt.show() prediction_regions_org=prediction_regions_org[:,:,0] prediction_regions_org[(prediction_regions_org[:,:]==1) & (mask_zeros_y[:,:]==1)]=0 - - session_region.close() - del model_region - del session_region - gc.collect() model_region, session_region = self.start_new_session_and_model(self.model_region_dir_p2) img = resize_image(img_org, int(img_org.shape[0]), int(img_org.shape[1])) @@ -1203,11 +1164,6 @@ class Eynollah: prediction_regions_org2=resize_image(prediction_regions_org2, img_height_h, img_width_h ) - session_region.close() - del model_region - del session_region - gc.collect() - mask_zeros2 = (prediction_regions_org2[:,:,0] == 0) mask_lines2 = (prediction_regions_org2[:,:,0] == 3) text_sume_early = (prediction_regions_org[:,:] == 1).sum() @@ -1247,12 +1203,6 @@ class Eynollah: prediction_bin =np.repeat(prediction_bin[:, :, np.newaxis], 3, axis=2) - session_bin.close() - del model_bin - del session_bin - gc.collect() - - model_region, session_region = self.start_new_session_and_model(self.model_region_dir_p_ens) ratio_y=1 @@ -1266,11 +1216,6 @@ class Eynollah: prediction_regions_org=prediction_regions_org[:,:,0] mask_lines_only=(prediction_regions_org[:,:]==3)*1 - session_region.close() - del model_region - del session_region - gc.collect() - mask_texts_only=(prediction_regions_org[:,:]==1)*1 mask_images_only=(prediction_regions_org[:,:]==2)*1 @@ -1289,20 +1234,12 @@ class Eynollah: text_regions_p_true=cv2.fillPoly(text_regions_p_true,pts=polygons_of_only_texts, color=(1,1,1)) - - - K.clear_session() return text_regions_p_true, erosion_hurts, polygons_lines_xml except: if self.input_binary: prediction_bin = np.copy(img_org) else: - session_region.close() - del model_region - del session_region - gc.collect() - model_bin, session_bin = self.start_new_session_and_model(self.model_dir_of_binarization) prediction_bin = self.do_prediction(True, img_org, model_bin) prediction_bin = resize_image(prediction_bin, img_height_h, img_width_h ) @@ -1314,15 +1251,6 @@ class Eynollah: prediction_bin =np.repeat(prediction_bin[:, :, np.newaxis], 3, axis=2) - - - session_bin.close() - del model_bin - del session_bin - gc.collect() - - - model_region, session_region = self.start_new_session_and_model(self.model_region_dir_p_ens) ratio_y=1 ratio_x=1 @@ -1335,11 +1263,6 @@ class Eynollah: prediction_regions_org=prediction_regions_org[:,:,0] #mask_lines_only=(prediction_regions_org[:,:]==3)*1 - session_region.close() - del model_region - del session_region - gc.collect() - #img = resize_image(img_org, int(img_org.shape[0]*1), 
int(img_org.shape[1]*1)) #prediction_regions_org = self.do_prediction(True, img, model_region) @@ -1349,11 +1272,6 @@ class Eynollah: #prediction_regions_org = prediction_regions_org[:,:,0] #prediction_regions_org[(prediction_regions_org[:,:] == 1) & (mask_zeros_y[:,:] == 1)]=0 - #session_region.close() - #del model_region - #del session_region - #gc.collect() - @@ -1381,7 +1299,7 @@ class Eynollah: text_regions_p_true = cv2.fillPoly(text_regions_p_true, pts = polygons_of_only_texts, color=(1,1,1)) erosion_hurts = True - K.clear_session() + return text_regions_p_true, erosion_hurts, polygons_lines_xml def do_order_of_regions_full_layout(self, contours_only_text_parent, contours_only_text_parent_h, boxes, textline_mask_tot): @@ -1873,9 +1791,8 @@ class Eynollah: img_new =np.ones((height_new,width_new,img.shape[2])).astype(float)*0 img_new[h_start:h_start+img.shape[0] ,w_start: w_start+img.shape[1], : ] =img[:,:,:] - + prediction_ext = self.do_prediction(patches, img_new, model_region) - pre_updown = self.do_prediction(patches, cv2.flip(img_new[:,:,:], -1), model_region) pre_updown = cv2.flip(pre_updown, -1) @@ -1896,9 +1813,8 @@ class Eynollah: img_new =np.ones((height_new,width_new,img.shape[2])).astype(float)*0 img_new[h_start:h_start+img.shape[0] ,w_start: w_start+img.shape[1], : ] =img[:,:,:] - + prediction_ext = self.do_prediction(patches, img_new, model_region) - pre_updown = self.do_prediction(patches, cv2.flip(img_new[:,:,:], -1), model_region) pre_updown = cv2.flip(pre_updown, -1) @@ -1911,12 +1827,10 @@ class Eynollah: else: prediction_table = np.zeros(img.shape) img_w_half = int(img.shape[1]/2.) - + pre1 = self.do_prediction(patches, img[:,0:img_w_half,:], model_region) pre2 = self.do_prediction(patches, img[:,img_w_half:,:], model_region) - pre_full = self.do_prediction(patches, img[:,:,:], model_region) - pre_updown = self.do_prediction(patches, cv2.flip(img[:,:,:], -1), model_region) pre_updown = cv2.flip(pre_updown, -1) @@ -1939,11 +1853,6 @@ class Eynollah: prediction_table_erode = cv2.erode(prediction_table[:,:,0], KERNEL, iterations=20) prediction_table_erode = cv2.dilate(prediction_table_erode, KERNEL, iterations=20) - del model_region - del session_region - gc.collect() - - return prediction_table_erode.astype(np.int16) def run_graphics_and_columns(self, text_regions_p_1, num_col_classifier, num_column_is_classified, erosion_hurts): @@ -1995,7 +1904,7 @@ class Eynollah: self.logger.info("Resizing and enhancing image...") is_image_enhanced, img_org, img_res, num_col_classifier, num_column_is_classified, img_bin = self.resize_and_enhance_image_with_column_classifier() self.logger.info("Image was %senhanced.", '' if is_image_enhanced else 'not ') - K.clear_session() + scale = 1 if is_image_enhanced: if self.allow_enhancement: @@ -2019,7 +1928,7 @@ class Eynollah: scaler_h_textline = 1 # 1.2#1.2 scaler_w_textline = 1 # 0.9#1 textline_mask_tot_ea, _ = self.textline_contours(image_page, True, scaler_h_textline, scaler_w_textline) - K.clear_session() + if self.plotter: self.plotter.save_plot_of_textlines(textline_mask_tot_ea, image_page) return textline_mask_tot_ea @@ -2032,7 +1941,7 @@ class Eynollah: if self.plotter: self.plotter.save_deskewed_image(slope_deskew) - self.logger.info("slope_deskew: %s", slope_deskew) + self.logger.info("slope_deskew: %.2f°", slope_deskew) return slope_deskew, slope_first def run_marginals(self, image_page, textline_mask_tot_ea, mask_images, mask_lines, num_col_classifier, slope_deskew, text_regions_p_1, table_prediction): @@ -2081,7 +1990,6 @@ 
class Eynollah: if np.abs(slope_deskew) >= SLOPE_THRESHOLD: _, _, matrix_of_lines_ch_d, splitter_y_new_d, _ = find_number_of_columns_in_document(np.repeat(text_regions_p_1_n[:, :, np.newaxis], 3, axis=2), num_col_classifier, self.tables, pixel_lines) - K.clear_session() self.logger.info("num_col_classifier: %s", num_col_classifier) @@ -2147,7 +2055,6 @@ class Eynollah: contours_tables = return_contours_of_interested_region(text_regions_p, pixel_img, min_area_mar) - K.clear_session() self.logger.debug('exit run_boxes_no_full_layout') return polygons_of_images, img_revised_tab, text_regions_p_1_n, textline_mask_tot_d, regions_without_separators_d, boxes, boxes_d, polygons_of_marginals, contours_tables @@ -2178,8 +2085,6 @@ class Eynollah: if np.abs(slope_deskew) >= SLOPE_THRESHOLD: num_col_d, peaks_neg_fin_d, matrix_of_lines_ch_d, splitter_y_new_d, seperators_closeup_n_d = find_number_of_columns_in_document(np.repeat(text_regions_p_1_n[:, :, np.newaxis], 3, axis=2),num_col_classifier, self.tables, pixel_lines) - K.clear_session() - gc.collect() if num_col_classifier>=3: if np.abs(slope_deskew) < SLOPE_THRESHOLD: @@ -2246,21 +2151,18 @@ class Eynollah: text_regions_p[:, :][text_regions_p[:, :] == 3] = 6 text_regions_p[:, :][text_regions_p[:, :] == 4] = 8 - K.clear_session() image_page = image_page.astype(np.uint8) regions_fully, regions_fully_only_drop = self.extract_text_regions(image_page, True, cols=num_col_classifier) text_regions_p[:,:][regions_fully[:,:,0]==6]=6 regions_fully_only_drop = put_drop_out_from_only_drop_model(regions_fully_only_drop, text_regions_p) regions_fully[:, :, 0][regions_fully_only_drop[:, :, 0] == 4] = 4 - K.clear_session() # plt.imshow(regions_fully[:,:,0]) # plt.show() regions_fully = putt_bb_of_drop_capitals_of_model_in_patches_in_layout(regions_fully) # plt.imshow(regions_fully[:,:,0]) # plt.show() - K.clear_session() regions_fully_np, _ = self.extract_text_regions(image_page, False, cols=num_col_classifier) # plt.imshow(regions_fully_np[:,:,0]) # plt.show() @@ -2271,7 +2173,6 @@ class Eynollah: # plt.imshow(regions_fully_np[:,:,0]) # plt.show() - K.clear_session() # plt.imshow(regions_fully[:,:,0]) # plt.show() regions_fully = boosting_headers_by_longshot_region_segmentation(regions_fully, regions_fully_np, img_only_regions) @@ -2297,7 +2198,6 @@ class Eynollah: if not self.tables: regions_without_separators = (text_regions_p[:, :] == 1) * 1 - K.clear_session() img_revised_tab = np.copy(text_regions_p[:, :]) polygons_of_images = return_contours_of_interested_region(img_revised_tab, 5) self.logger.debug('exit run_boxes_full_layout') @@ -2470,7 +2370,7 @@ class Eynollah: all_found_texline_polygons = small_textlines_to_parent_adherence2(all_found_texline_polygons, textline_mask_tot_ea, num_col_classifier) all_found_texline_polygons_marginals, boxes_marginals, _, polygons_of_marginals, all_box_coord_marginals, _, slopes_marginals = self.get_slopes_and_deskew_new_curved(polygons_of_marginals, polygons_of_marginals, cv2.erode(textline_mask_tot_ea, kernel=KERNEL, iterations=1), image_page_rotated, boxes_marginals, text_only, num_col_classifier, scale_param, slope_deskew) all_found_texline_polygons_marginals = small_textlines_to_parent_adherence2(all_found_texline_polygons_marginals, textline_mask_tot_ea, num_col_classifier) - K.clear_session() + if self.full_layout: if np.abs(slope_deskew) >= SLOPE_THRESHOLD: contours_only_text_parent_d_ordered = list(np.array(contours_only_text_parent_d_ordered, dtype=object)[index_by_text_par_con]) @@ -2483,8 +2383,6 @@ class 
Eynollah: self.plotter.save_plot_of_layout(text_regions_p, image_page) self.plotter.save_plot_of_layout_all(text_regions_p, image_page) - K.clear_session() - pixel_img = 4 polygons_of_drop_capitals = return_contours_of_interested_region_by_min_size(text_regions_p, pixel_img) all_found_texline_polygons = adhere_drop_capital_region_into_corresponding_textline(text_regions_p, polygons_of_drop_capitals, contours_only_text_parent, contours_only_text_parent_h, all_box_coord, all_box_coord_h, all_found_texline_polygons, all_found_texline_polygons_h, kernel=KERNEL, curved_line=self.curved_line) From 5c26bdf402c8f82e185f9a3704e11d806a12546f Mon Sep 17 00:00:00 2001 From: Robert Sachunsky Date: Thu, 16 Feb 2023 13:56:47 +0000 Subject: [PATCH 24/34] ocrd-tool: add model archive to resmgr resources --- qurator/eynollah/ocrd-tool.json | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-) diff --git a/qurator/eynollah/ocrd-tool.json b/qurator/eynollah/ocrd-tool.json index 220f2ea..9121868 100644 --- a/qurator/eynollah/ocrd-tool.json +++ b/qurator/eynollah/ocrd-tool.json @@ -44,7 +44,17 @@ "default": false, "description": "ignore the special role of headings during reading order detection" } - } + }, + "resources": [ + { + "description": "models for eynollah (TensorFlow format)", + "url": "https://ocr-d.kba.cloud/2021-04-25.SavedModel.tar.gz", + "name": "default", + "size": 1483106598, + "type": "archive", + "path_in_archive": "default" + } + ] } } } From 23f0c0b40a8005383d2fe72f18bf0016668f81ce Mon Sep 17 00:00:00 2001 From: Robert Sachunsky Date: Thu, 16 Feb 2023 18:50:57 +0000 Subject: [PATCH 25/34] ocrd-tool: replace by persistent model URL --- qurator/eynollah/ocrd-tool.json | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/qurator/eynollah/ocrd-tool.json b/qurator/eynollah/ocrd-tool.json index 9121868..ab97a4f 100644 --- a/qurator/eynollah/ocrd-tool.json +++ b/qurator/eynollah/ocrd-tool.json @@ -48,7 +48,7 @@ "resources": [ { "description": "models for eynollah (TensorFlow format)", - "url": "https://ocr-d.kba.cloud/2021-04-25.SavedModel.tar.gz", + "url": "https://qurator-data.de/eynollah/2021-04-25/SavedModel.tar.gz", "name": "default", "size": 1483106598, "type": "archive", From 318ea6acca1e4a76bdd0a26c554fd57f6db0ffa1 Mon Sep 17 00:00:00 2001 From: Robert Sachunsky Date: Thu, 16 Feb 2023 14:40:02 +0000 Subject: [PATCH 26/34] OCR-D wrapper: expose tables param --- qurator/eynollah/ocrd-tool.json | 5 +++++ qurator/eynollah/processor.py | 1 + 2 files changed, 6 insertions(+) diff --git a/qurator/eynollah/ocrd-tool.json b/qurator/eynollah/ocrd-tool.json index ab97a4f..b9b4020 100644 --- a/qurator/eynollah/ocrd-tool.json +++ b/qurator/eynollah/ocrd-tool.json @@ -29,6 +29,11 @@ "default": true, "description": "Try to detect all element subtypes, including drop-caps and headings" }, + "tables": { + "type": "boolean", + "default": false, + "description": "Try to detect table regions" + }, "curved_line": { "type": "boolean", "default": false, diff --git a/qurator/eynollah/processor.py b/qurator/eynollah/processor.py index 41b12ae..ccec456 100644 --- a/qurator/eynollah/processor.py +++ b/qurator/eynollah/processor.py @@ -50,6 +50,7 @@ class EynollahProcessor(Processor): 'full_layout': self.parameter['full_layout'], 'allow_scaling': self.parameter['allow_scaling'], 'headers_off': self.parameter['headers_off'], + 'tables': self.parameter['tables'], 'override_dpi': self.parameter['dpi'], 'logger': LOG, 'pcgts': pcgts, From 875e4fe32bbe1d87bbfcf776ae2f3b32c9b61f6f Mon Sep 17 
00:00:00 2001 From: Robert Sachunsky Date: Thu, 16 Feb 2023 14:45:37 +0000 Subject: [PATCH 27/34] log number of detected regions --- qurator/eynollah/eynollah.py | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/qurator/eynollah/eynollah.py b/qurator/eynollah/eynollah.py index 170b5a7..264bb62 100644 --- a/qurator/eynollah/eynollah.py +++ b/qurator/eynollah/eynollah.py @@ -388,6 +388,7 @@ class Eynollah: img = self.imread() img_bin = None + t1 = time.time() _, page_coord = self.early_page_for_num_of_column_classification(img_bin) model_num_classifier, session_col_classifier = self.start_new_session_and_model(self.model_dir_of_col_classifier) @@ -413,6 +414,7 @@ class Eynollah: label_p_pred = model_num_classifier.predict(img_in, verbose=0) num_col = np.argmax(label_p_pred[0]) + 1 self.logger.info("Found %d columns (%s)", num_col, np.around(label_p_pred, decimals=5)) + self.logger.info("detecting columns took %.1fs", time.time() - t1) if dpi < DPI_THRESHOLD: img_new, num_column_is_classified = self.calculate_width_height_by_columns(img, num_col, width_early, label_p_pred) @@ -2356,6 +2358,14 @@ class Eynollah: # self.logger.debug('len(contours_only_text_parent) %s', len(contours_only_text_parent_d)) else: pass + + self.logger.info("Found %d text regions", len(contours_only_text_parent)) + self.logger.info("Found %d margin regions", len(polygons_of_marginals)) + self.logger.info("Found %d image regions", len(polygons_of_images)) + self.logger.info("Found %d separator lines", len(polygons_lines_xml)) + if self.tables: + self.logger.info("Found %d tables", len(contours_tables)) + txt_con_org = get_textregion_contours_in_org_image(contours_only_text_parent, self.image, slope_first) boxes_text, _ = get_text_region_boxes_by_given_contours(contours_only_text_parent) boxes_marginals, _ = get_text_region_boxes_by_given_contours(polygons_of_marginals) From 31a2ec8fe68bfbc1ec0515694ec834f205e1d415 Mon Sep 17 00:00:00 2001 From: Konstantin Baierer Date: Wed, 22 Mar 2023 14:18:57 +0100 Subject: [PATCH 28/34] :memo: changelog --- CHANGELOG.md | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/CHANGELOG.md b/CHANGELOG.md index e8815d6..3446404 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -5,6 +5,17 @@ Versioned according to [Semantic Versioning](http://semver.org/). ## Unreleased +Fixed: + + * Do not produce spurious `TextEquiv`, #68 + * Less spammy logging, #64, #65, #71 + +Changed: + + * Upgrade to tensorflow 2.4.0, #74 + * Improved README + * CI: test for python 3.7+, #90 + ## [0.0.11] - 2022-02-02 Fixed: From 71d0ec8dfeed71fe8f18b684f231f1af52ed56e4 Mon Sep 17 00:00:00 2001 From: Konstantin Baierer Date: Wed, 22 Mar 2023 14:21:53 +0100 Subject: [PATCH 29/34] :package: v0.1.0 --- CHANGELOG.md | 3 +++ qurator/eynollah/ocrd-tool.json | 2 +- 2 files changed, 4 insertions(+), 1 deletion(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 3446404..240753f 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -5,6 +5,8 @@ Versioned according to [Semantic Versioning](http://semver.org/). 
## Unreleased +## [0.1.0] - 2023-03-22 + Fixed: * Do not produce spurious `TextEquiv`, #68 @@ -83,6 +85,7 @@ Fixed: Initial release +[0.1.0]: ../../compare/v0.1.0...v0.0.11 [0.0.11]: ../../compare/v0.0.11...v0.0.10 [0.0.10]: ../../compare/v0.0.10...v0.0.9 [0.0.9]: ../../compare/v0.0.9...v0.0.8 diff --git a/qurator/eynollah/ocrd-tool.json b/qurator/eynollah/ocrd-tool.json index 220f2ea..03bc52a 100644 --- a/qurator/eynollah/ocrd-tool.json +++ b/qurator/eynollah/ocrd-tool.json @@ -1,5 +1,5 @@ { - "version": "0.0.11", + "version": "0.1.0", "git_url": "https://github.com/qurator-spk/eynollah", "tools": { "ocrd-eynollah-segment": { From 7cd07dd550f1c1f4884aa2cb45339c44db878037 Mon Sep 17 00:00:00 2001 From: Konstantin Baierer Date: Wed, 22 Mar 2023 16:19:00 +0100 Subject: [PATCH 30/34] use PEP420 style qurator namespace --- qurator/__init__.py | 1 - setup.py | 1 - 2 files changed, 2 deletions(-) diff --git a/qurator/__init__.py b/qurator/__init__.py index 5284146..e69de29 100644 --- a/qurator/__init__.py +++ b/qurator/__init__.py @@ -1 +0,0 @@ -__import__("pkg_resources").declare_namespace(__name__) diff --git a/setup.py b/setup.py index 9abf158..c78ee3f 100644 --- a/setup.py +++ b/setup.py @@ -13,7 +13,6 @@ setup( author='Vahid Rezanezhad', url='https://github.com/qurator-spk/eynollah', license='Apache License 2.0', - namespace_packages=['qurator'], packages=find_packages(exclude=['tests']), install_requires=install_requires, package_data={ From e167e0863d64928c9cad30375d6c5b9bad476fa2 Mon Sep 17 00:00:00 2001 From: Konstantin Baierer Date: Fri, 24 Mar 2023 14:16:10 +0100 Subject: [PATCH 31/34] :memo: changelog --- CHANGELOG.md | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/CHANGELOG.md b/CHANGELOG.md index 240753f..eb44b0c 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -5,6 +5,15 @@ Versioned according to [Semantic Versioning](http://semver.org/). ## Unreleased +Changed: + + * Convert default model from HDF5 to TF SavedModel, #91 + +Added: + + * parameter `tables` to toggle table detection, #91 + * default model described in ocrd-tool.json, #91 + ## [0.1.0] - 2023-03-22 Fixed: From ea792d1e4ac4a722770b82dc91e71f84d5beb212 Mon Sep 17 00:00:00 2001 From: Konstantin Baierer Date: Fri, 24 Mar 2023 14:16:55 +0100 Subject: [PATCH 32/34] :package: v0.2.0 --- CHANGELOG.md | 3 +++ qurator/eynollah/ocrd-tool.json | 2 +- 2 files changed, 4 insertions(+), 1 deletion(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index eb44b0c..9f6ceff 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -5,6 +5,8 @@ Versioned according to [Semantic Versioning](http://semver.org/).
## Unreleased +## [0.2.0] - 2023-03-24 + Changed: * Convert default model from HDF5 to TF SavedModel, #91 Added: @@ -94,6 +96,7 @@ Fixed: Initial release +[0.2.0]: ../../compare/v0.2.0...v0.1.0 [0.1.0]: ../../compare/v0.1.0...v0.0.11 [0.0.11]: ../../compare/v0.0.11...v0.0.10 [0.0.10]: ../../compare/v0.0.10...v0.0.9 diff --git a/qurator/eynollah/ocrd-tool.json b/qurator/eynollah/ocrd-tool.json index e6a06e5..fc9ee72 100644 --- a/qurator/eynollah/ocrd-tool.json +++ b/qurator/eynollah/ocrd-tool.json @@ -1,5 +1,5 @@ { - "version": "0.1.0", + "version": "0.2.0", "git_url": "https://github.com/qurator-spk/eynollah", "tools": { "ocrd-eynollah-segment": { From 14fc04042841556f5e59f9e3ca185e8c1a004d1d Mon Sep 17 00:00:00 2001 From: Konstantin Baierer Date: Sun, 2 Apr 2023 14:07:51 +0200 Subject: [PATCH 33/34] use find_namespace_packages in setup.py --- setup.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/setup.py b/setup.py index c78ee3f..807eae7 100644 --- a/setup.py +++ b/setup.py @@ -1,4 +1,4 @@ -from setuptools import setup, find_packages +from setuptools import find_namespace_packages, find_packages, setup from json import load install_requires = open('requirements.txt').read().split('\n') From 8fe35671237ee86c3ec94db18df018dbc972aab3 Mon Sep 17 00:00:00 2001 From: Robert Sachunsky <38561704+bertsky@users.noreply.github.com> Date: Thu, 13 Apr 2023 19:02:41 +0200 Subject: [PATCH 34/34] set_memory_growth to all GPU devices alike --- qurator/eynollah/eynollah.py | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/qurator/eynollah/eynollah.py b/qurator/eynollah/eynollah.py index 264bb62..9312c42 100644 --- a/qurator/eynollah/eynollah.py +++ b/qurator/eynollah/eynollah.py @@ -496,7 +496,8 @@ class Eynollah: #session = tf.compat.v1.Session(config=tf.compat.v1.ConfigProto(gpu_options=gpu_options)) physical_devices = tf.config.list_physical_devices('GPU') try: - tf.config.experimental.set_memory_growth(physical_devices[0], True) + for device in physical_devices: + tf.config.experimental.set_memory_growth(device, True) except: self.logger.warning("no GPU device available") if model_dir.endswith('.h5') and Path(model_dir[:-3]).exists():
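
A note on the `dtype=object` changes at the start of this excerpt: lists of OpenCV contours are ragged (each polygon has a different number of points), so newer numpy releases refuse to build a regular array from them. Passing `dtype=object` keeps each contour intact while still allowing indexing by a list of positions, which is all the reordering code needs. A minimal sketch of the pattern (the shapes and index list below are made up for illustration):

```python
import numpy as np

# Ragged input: polygons with different point counts, as cv2.findContours returns them.
contours = [np.zeros((12, 1, 2), int), np.zeros((7, 1, 2), int), np.zeros((30, 1, 2), int)]
index_by_text_par_con = [2, 0, 1]  # new reading order

# Without dtype=object, np.array() tries to build a rectangular array and
# fails (or deprecation-warns) on ragged input; with it, each element stays
# a contour and the 1-d object array still supports fancy indexing.
reordered = list(np.array(contours, dtype=object)[index_by_text_par_con])
assert reordered[0].shape == (30, 1, 2)
```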
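
The `self.models` dict added in patch 23 turns `start_new_session_and_model` into a memoized loader: each Keras model is read from disk once per `Eynollah` instance and reused on later calls, which is what makes dropping all the `del model`/`gc.collect()` blocks safe. Reduced to a standalone sketch (the `ModelCache` class is illustrative, not the actual API):

```python
from pathlib import Path
from tensorflow.keras.models import load_model

class ModelCache:
    """Load each model at most once and hand out the cached instance afterwards."""

    def __init__(self):
        self.models = {}

    def get(self, model_dir: str):
        # Prefer the SavedModel directory over the HDF5 file if both exist side by side.
        if model_dir.endswith('.h5') and Path(model_dir[:-3]).exists():
            model_dir = model_dir[:-3]
        if model_dir not in self.models:
            self.models[model_dir] = load_model(model_dir, compile=False)
        return self.models[model_dir]
```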
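
Patch 23 also swaps the TF1 `GPUOptions`/`Session` plumbing for the TF2 memory-growth API, and patch 34 extends it from `physical_devices[0]` to every GPU. `set_memory_growth` must run before the first op initializes the device; a variant with narrower exception handling than the bare `except:` in the diff could look like this:

```python
import logging
import tensorflow as tf

logger = logging.getLogger('eynollah')

# TF2 equivalent of tf.compat.v1.GPUOptions(allow_growth=True): allocate GPU
# memory on demand instead of reserving it all up front.
for device in tf.config.list_physical_devices('GPU'):
    try:
        # Must be called before the device is initialized by the first op.
        tf.config.experimental.set_memory_growth(device, True)
    except (ValueError, RuntimeError) as err:
        logger.warning("could not enable memory growth on %s: %s", device, err)
```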
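
Finally, patches 30 and 33 migrate the `qurator` namespace from the legacy `pkg_resources.declare_namespace` mechanism to PEP 420 implicit namespace packages. The diffs only show the dropped `namespace_packages=` argument and the new import; a hypothetical complete `setup()` call using it would be:

```python
from setuptools import find_namespace_packages, setup

setup(
    name='eynollah',
    # With PEP 420 there is no declare_namespace() call and no namespace_packages=
    # argument; qurator/ is discovered even without (or with an empty) __init__.py.
    packages=find_namespace_packages(include=['qurator', 'qurator.*']),
)
```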