Commit graph

276 commits

Author SHA1 Message Date
Robert Sachunsky
235539a350 filter_contours_without_textline_inside: avoid removing from identical lists twice 2025-09-29 17:47:51 +02:00
Robert Sachunsky
11e143afee polygon2contour: avoid overflow 2025-09-29 17:47:51 +02:00
Robert Sachunsky
7a9e8256ee increase dilatation: textregions/lines (5→6), seplines (0→1) 2025-09-29 17:47:51 +02:00
Robert Sachunsky
f3faa29528 refactor shapely conversions into contour2polygon / polygon2contour, also handle heterogeneous geometries 2025-09-29 17:47:51 +02:00
Robert Sachunsky
0650274ffa move dilate_*_contours to .utils.contour, rename dilate_textregions_contours_textline_version → dilate_textline_contours 2025-09-29 17:47:47 +02:00
Robert Sachunsky
a433c73628 filter_contours_area_of_image*: also ensure validity here 2025-09-29 17:46:50 +02:00
Robert Sachunsky
17bcf1af71 rename *lines_xml → *seplines for clarity 2025-09-29 17:46:50 +02:00
Robert Sachunsky
e730725da3 check_any_text_region_in_model_one_is_main_or_header_light: return original instead of resampled contours 2025-09-29 17:46:50 +02:00
Robert Sachunsky
7b51fd6624 avoid creating invalid polygons via rounding 2025-09-29 17:46:50 +02:00
Robert Sachunsky
41cc38c51a get_textregion_contours_in_org_image_light: no back rotation, drop slope_first (always 0) 2025-09-29 17:46:48 +02:00
Robert Sachunsky
afba70c920 separate_lines/do_work_of_slopes: skip if crop is empty 2025-09-29 17:44:39 +02:00
Robert Sachunsky
66b2bce8b9 return_boxes_of_images_by_order_of_reading_new: log any exceptions 2025-09-29 17:44:36 +02:00
Robert Sachunsky
b48c41e68f return_boxes_of_images_by_order_of_reading_new: simplify, avoid changing dtype during np.append 2025-09-29 17:42:53 +02:00
Robert Sachunsky
09ece86f0d dilate_textregions_contours: simplify (via shapely's Polygon.buffer()), ensure validity 2025-09-29 17:42:53 +02:00
kba
6ea6a62801 📝 v0.5.0 2025-09-26 16:23:46 +02:00
Robert Sachunsky
480daa4c7c test_run: make ocr -doit work (add truetype file) 2025-09-25 22:28:15 +02:00
kba
f37d80c188 Merge branch 'adapt-ocrd' of https://github.com/qurator-spk/eynollah into adapt-ocrd 2025-09-25 21:39:55 +02:00
kba
57ee1cdc72 Merge remote-tracking branch 'bertsky/mbro_dead_code-plus-fixes-plus-tests' into adapt-ocrd 2025-09-25 21:39:36 +02:00
kba
9303ded11f ocrd-tool.json: use models_layout instead of eynollah_layouts for consistency 2025-09-25 21:12:52 +02:00
Robert Sachunsky
7c79902835 enhancement/mbreorder: make all path options kwargs to run() instead of attributes 2025-09-25 20:51:02 +02:00
kba
11de8a025d Adapt ocrd-eynollah-segment for release 2025-09-25 20:11:48 +02:00
kba
5e15c4f248 Merge remote-tracking branch 'bertsky/mbro_dead_code-plus-fixes-plus-tests' into prepare-release-v0.5.0 2025-09-25 20:05:03 +02:00
Robert Sachunsky
2d14d57e4f ocr: minimal debug logging 2025-09-25 19:52:50 +02:00
Robert Sachunsky
1dcc7b5795 ocr CLI: make --model vs --model_name xor 2025-09-25 16:38:43 +02:00
Robert Sachunsky
5b1e0c1327 layout/ocr: make all path options kwargs to run() instead of attributes; ocr: drop redundant prediction_with_both_of_rgb_and_bin in favour of just bool(dir_in_bin) 2025-09-25 16:26:31 +02:00
Robert Sachunsky
ef1304a764 CLIs: reorder options, explain -i vs -di 2025-09-25 16:11:39 +02:00
Robert Sachunsky
df5448cdcd CLIs: add required=True where missing 2025-09-25 16:08:40 +02:00
b-vr103
369ef573f9 get textlines sorted in textregions - detection of vertical and horizontal regions improved 2025-09-25 12:51:02 +02:00
Robert Sachunsky
9967510327 mbreorder: filter by .xml suffix in dir-in mode 2025-09-25 01:15:37 +02:00
Robert Sachunsky
b094a6b77f mbreorder: avoid spaces in logger name 2025-09-25 01:15:37 +02:00
Robert Sachunsky
d6cdb69acb binarize/enhance/layout/ocr ls_imgs: use the same file name suffix filter for dir-in mode 2025-09-25 01:15:37 +02:00
Robert Sachunsky
96a0d22496 mbreorder CLI: change options to mimic other commands 2025-09-25 01:15:37 +02:00
Robert Sachunsky
93f7588bfa binarizer CLI: add --log-level 2025-09-24 23:08:50 +02:00
Robert Sachunsky
8a1e5a8950 enhancement / layout CLI: do not override logger name 2025-09-24 23:03:11 +02:00
Robert Sachunsky
960b11f51f machine-based-reading-order CLI: no foreign logger, add --log-level 2025-09-24 22:58:57 +02:00
kba
45b05c2316 Merge branch 'mbro_dead_code' into prepare-release-v0.5.0 2025-09-24 17:18:31 +02:00
vahidrezanezhad
80d50d4bf6 get textlines sorted in textregion - verticals 2025-09-24 17:17:27 +02:00
b-vr103
6d8641a518 get textlines sorted in textregion - verticals 2025-09-24 17:17:21 +02:00
vahidrezanezhad
6904a98182 get textlines inside textregion sorted debugging 2025-09-24 17:17:12 +02:00
vahidrezanezhad
ce13d8c5a3 get textlines inside textregion sorted 2025-09-24 17:16:47 +02:00
kba
8b30bdbae2 image_enhancer: use latest page extraction model 2025-09-24 16:39:31 +02:00
kba
c8ebe84697 image_enhancer: add missing models, remove dead code 2025-09-24 16:36:18 +02:00
kba
b75ca0d31f mb_ro_on_layout: remove copy-pasta code not actually used 2025-09-24 16:29:05 +02:00
Robert Sachunsky
5bd318e657 rm print statement (already log msg) 2025-09-24 12:14:32 +02:00
Robert Sachunsky
90f1d7aa47 rm summary msg (info already logged elsewhere) 2025-09-24 12:10:11 +02:00
Robert Sachunsky
7933b103f5 log modes only once (in run, not in run_single) 2025-09-24 12:09:30 +02:00
Robert Sachunsky
d0817f5744 fix typo 2025-09-24 12:08:50 +02:00
kba
9ead58b99a Merge remote-tracking branch 'michalbubula/add-feedback' into prepare-release-v0.5.0 2025-09-23 19:50:27 +02:00
kba
7bde99e866 Merge remote-tracking branch 'origin/updating_readme_for_eynollah_use_cases' into prepare-release-v0.5.0 2025-09-23 19:42:55 +02:00
kba
df8d93dbfa Merge branch 'main' into add-feedback 2025-09-23 19:20:20 +02:00