Commit graph

682 commits

Author SHA1 Message Date
kba
869110f185 merge main 2025-01-20 14:45:27 +01:00
vahidrezanezhad
33fda2f8be changing cnn ocr model name 2024-12-26 22:45:40 +01:00
Robert Sachunsky
335aa273a1 simplify, wrap extremely long lines 2024-12-23 13:36:29 +00:00
Robert Sachunsky
cfc65128b1 reduce redundancy/indentation 2024-12-22 14:56:32 +00:00
Robert Sachunsky
01376af905 do_order_of_regions_with_model: simplify 2024-12-22 13:10:05 +00:00
vahidrezanezhad
92bfac4b41 Provide OCR as an option to process a directory of XML files, incorporating layout and text line coordinates. 2024-12-20 15:47:21 +01:00
vahidrezanezhad
fbeef79d50 adding scatter_nd inference 2024-12-16 01:11:54 +01:00
Robert Sachunsky
0ae28f7d3e switch from stdlib to loky.ProcessPoolExecutor, ensure shutdown 2024-12-14 12:16:29 +00:00
vahidrezanezhad
f93c6c288d function of patch-wise inference with scatter_nd is added 2024-12-14 02:50:17 +01:00
vahidrezanezhad
0e8c561618 debugging issues 2024-12-14 00:24:29 +01:00
Robert Sachunsky
e9c0d716f6 CI: install optional dependencies, too 2024-12-11 23:48:56 +00:00
Robert Sachunsky
dcaf796283 change polarity of orientation angle (PAGE schema required cw=positive) 2024-12-11 23:07:56 +00:00
Robert Sachunsky
b4b0890294 add option to overwrite output xml, but skip by default if file exists 2024-12-11 19:52:21 +00:00
Robert Sachunsky
b9ca7a6191 log num_cols-dependent resizing 2024-12-11 18:48:26 +00:00
Robert Sachunsky
9270ea4550 annotate region angles in PAGE 2024-12-11 18:48:26 +00:00
Robert Sachunsky
3b70b11ea6 avoid deskewing patches if binary-empty 2024-12-11 18:48:26 +00:00
Robert Sachunsky
7e9ee90e6e switch from (ad-hoc) mp.Pool to (attribute) concurrent.futures.ProcessPoolExecutor 2024-12-11 18:48:26 +00:00
Robert Sachunsky
68456ea002 do_work_of_slopes_new*, do_back_rotation_and_get_cnt_back, do_work_of_contours_in_image: use mp.Pool, simplify 2024-12-11 18:48:26 +00:00
Robert Sachunsky
25e967397d exit early if no text regions found (to avoid segfault) 2024-12-11 18:48:26 +00:00
Robert Sachunsky
21efea8711 no del on function argument 2024-12-11 18:48:26 +00:00
Robert Sachunsky
5e0c1da711 simplify 2024-12-11 00:18:58 +00:00
Robert Sachunsky
54cb15056b do_image_rotation / return_deskew_slop: avoid code duplication, simplify via mp.Pool 2024-12-10 09:52:32 +00:00
Robert Sachunsky
6fe02df973 do_image_rotation: fix f93fa12 (do return results) 2024-12-09 16:35:31 +00:00
Robert Sachunsky
d68017037c do_prediction: trigger GC to avoid CUDA OOM 2024-12-09 11:27:11 +00:00
Robert Sachunsky
ad748d0039 do_prediction: avoid code duplication 2024-12-09 10:55:41 +00:00
Robert Sachunsky
c3163caefd avoid indentation 2024-12-05 14:28:17 +00:00
Robert Sachunsky
055463d23a avoid indentation 2024-12-05 09:43:30 +00:00
Robert Sachunsky
aaea2ef463 simplify 2024-12-05 09:40:02 +00:00
Robert Sachunsky
3d88b207fc run: log instead of print 2024-12-05 09:39:55 +00:00
Robert Sachunsky
a520bd1f77 wrap extremely long lines 2024-12-04 23:04:51 +00:00
Robert Sachunsky
cd4e426977 avoid indentation (skip_layout_and_reading_order) 2024-12-04 23:04:48 +00:00
Robert Sachunsky
5b82320707 avoid indentation 2024-12-04 22:09:32 +00:00
Robert Sachunsky
9f12fa241d log-level: only set 'eynollah' logger level 2024-12-04 22:09:15 +00:00
Robert Sachunsky
14beb46224 simplify loading models w/o dir_in mode 2024-12-04 21:07:26 +00:00
Robert Sachunsky
329fac23f6 do not reload enhancement model in dir_in mode, simplify 2024-12-04 18:29:49 +00:00
Robert Sachunsky
3b9a29bc5c simplify dir_in conditionals 2024-12-04 18:19:54 +00:00
Robert Sachunsky
7ae64f3717 RO model: do not reload when in dir_in mode 2024-12-04 16:18:35 +00:00
Robert Sachunsky
f765e2603b move Torch to optional dependencies (to avoid clash with TF over CuDNN) 2024-12-04 15:57:13 +00:00
vahidrezanezhad
871d7bfc5a fixed: machine based reading order cause tuple index out of range error if number of textregion is one. 2024-12-04 16:41:00 +01:00
vahidrezanezhad
6aad006f4c filter textregions without textline 2024-12-02 12:43:57 +01:00
kba
1083d1c7fb gha: try to free disk space 2024-11-25 19:32:48 +01:00
vahidrezanezhad
8014a9e416 Update Makefile 2024-11-22 19:47:06 +01:00
vahidrezanezhad
3000255a24 Update Makefile 2024-11-22 12:40:21 +01:00
vahidrezanezhad
1746920275 Update Makefile 2024-11-21 12:08:29 +01:00
vahidrezanezhad
b622494f34 new table detection model is integrated 2024-11-21 02:16:22 +01:00
vahidrezanezhad
d9f79c3404 fixing IndexError by reading order detection 2024-11-18 10:15:19 +01:00
vahidrezanezhad
5fa8ca46a4 updating requirements 2024-11-14 17:35:00 +01:00
vahidrezanezhad
ce5b611296 tests are passed - new models by the way should be uploaded 2024-11-14 17:18:07 +01:00
vahidrezanezhad
f43c49c508 textlines of drop capitals are connected to corresponding textline if possible otherwise they are inserted in corresponding textregion 2024-11-13 11:53:56 +01:00
vahidrezanezhad
22b0b07a73 drop capital and marginals extraction is updated 2024-11-11 19:01:40 +01:00