111 Commits (8080bd823c59a7ff27de7a5ef4e92563eb91f88b)

Author SHA1 Message Date
kba 8080bd823c 📦 v0.4.0 1 month ago
vahidrezanezhad e2907f67e0 'from PIL.Image import Image' causes an error when using Image.new(), and since Image is already imported, this line can be safely commented out. 1 month ago
Robert Sachunsky 4339444e47 binarization CLI: fix option checks, simplify to asserts, fix dir_in mode 1 month ago
Robert Sachunsky 91a340f619 CLI: simplify option checks to asserts (also avoid stack trace) 1 month ago
Robert Sachunsky e0a7fde537 logger: fix type hint 1 month ago
Robert Sachunsky 108ce1f5a1 Merge remote-tracking branch 'origin/main' into v3-api-release-foreal (bad-ass difficult diff diffing) 1 month ago
vahidrezanezhad 2e3a29f66b In light mode: To determine whether a main region is a header, I adjusted the ratio to achieve better results. 1 month ago
vahidrezanezhad 38a2d60fa2 The confidence value for textregions in the non-light version is set to zero so that the pipeline can run through. It will be updated to return the correct value in upcoming commits 1 month ago
vahidrezanezhad 6b52da227c decorating eynollah with textregion confidence score #135 1 month ago
Robert Sachunsky 559d001eef another fix to avoid frequent warnings 1 month ago
Robert Sachunsky dd478279a4 CLI: also --overwrite in single-image mode 1 month ago
Robert Sachunsky 8159e6336a fix typo (preventing log messages) 1 month ago
Robert Sachunsky 2919538382 minor fixes to avoid frequent warnings 1 month ago
Robert Sachunsky dcf2ed5e22 run: also write out XML in single filename mode 1 month ago
Robert Sachunsky fe77171d45 run_single: reduce indentation 1 month ago
Robert Sachunsky 79003a083c CLI: ValueError instead of print+exit 1 month ago
Robert Sachunsky e17d34fafa factor run_single() out of run(), simplify kwargs 1 month ago
Robert Sachunsky 1a0a1cb00b remove session methods and redundant model loaders 1 month ago
Robert Sachunsky dd51f900b9 OCR-D: init Eynollah in 'setup', re-use instance for each page via non-public API 1 month ago
Robert Sachunsky ffeb4a343d Eynollah: remove useless 'pcgts' attr 1 month ago
vahidrezanezhad 91b2201b07 cnnrnn Ocr: width of input textline image can not be zero! 1 month ago
Robert Sachunsky 515b4023f6 sbb_binarize: fix missing reference 1 month ago
vahidrezanezhad 4de441eaaa OCR prediction is now enabled to integrate results from both RGB and binarized images or to be performed on each individually 1 month ago
vahidrezanezhad b1da0a3327 In OCR, the predicted text is now drawn on the image, and the results are saved in a specified directory. This makes it easier to review the predicted output 1 month ago
Robert Sachunsky c01609ff4e allow even more empty imports for optional dependencies 1 month ago
Robert Sachunsky 46618f4229 allow more empty imports for optional dependencies 1 month ago
Robert Sachunsky 4be89910a2 CLI: fix arg vs kwarg from merge 2 months ago
Robert Sachunsky 9d61acf173 simplify 2 months ago
Robert Sachunsky a1068ff2eb OCR-D: move sbb-binarize to ocrd-tool.json, update to v3 2 months ago
Robert Sachunsky c794d4d29f OCR-D: fix typo light_mode→light_version 2 months ago
Robert Sachunsky 4338259ca1 OCR-D: ensure page image gets replaced in result as well if not the original file 2 months ago
Robert Sachunsky 55969b0173 OCR-D: add docstring 2 months ago
Robert Sachunsky 6d02e90570 OCR-D: restrict max_workers=1 2 months ago
Robert Sachunsky efd3fa6775 allow empty imports for optional dependencies 2 months ago
Robert Sachunsky 238132e260 use 'image_filename' for pseudo-iteration outside 'dir_in' mode 2 months ago
Robert Sachunsky af4e2a4ffc do not require 'dir_out' outside 'dir_in' mode 2 months ago
Robert Sachunsky ea136e3ddd 'overwrite' check: only in 'dir_in' mode 2 months ago
Robert Sachunsky 1f4a17b60d Merge remote-tracking branch 'origin/machine_based_reading_order_integration' into v3-api 2 months ago
Robert Sachunsky edf924c2cb ocrd-tool: add dockerhub 2 months ago
vahidrezanezhad 9b04688ebc The rotate_image function has been updated. Additionally, the reading order is now correct in the case of the light version, provided that slope_deskew exceeds the slope_threshold. 2 months ago
vahidrezanezhad cf40f9ecc5 The rotate_image function produces the exact same rotation as Imutils. Therefore, there is no need to retain the remove-imutils-1 branch. 2 months ago
vahidrezanezhad f756b08c9b Revert "replace usages of `imutils` with opencv equivalents" 2 months ago
vahidrezanezhad 52c605185a Merge pull request #146 from qurator-spk/remove-imutils-1: replace usages of `imutils` with opencv equivalents 2 months ago
vahidrezanezhad 6f36c7177f For OCR, the splitting ratio of text lines is adjusted 2 months ago
cneud 181c0c584f bbox rotation with opencv 2 months ago
cneud eaff9e3537 Merge branch 'main' into remove-imutils-1 2 months ago
vahidrezanezhad 7df0427b04 In the context of OCR, if Page-XML files already contain text, the new predicted text will replace the existing text. 2 months ago
vahidrezanezhad 370d44a66b Slope deskew in the light version is set to zero because the reading order becomes incorrect when the slope_deskew value exceeds the slope_threshold. This issue needs to be addressed. Additionally, the order of textlines within a text region was reversed in the light version, and this has been corrected. 2 months ago
vahidrezanezhad d3a4c06e7f This commit enables the export of cropped text line images along with their corresponding texts from a Page-XML file. These exported text line images and texts can be utilized for training a text line-based OCR model. 2 months ago
vahidrezanezhad c8b8529951 For the CNN-RNN OCR model, long text lines are split into two segments 2 months ago