vahidrezanezhad
|
c8b8529951
|
For the CNN-RNN OCR model, long text lines are split into two segments
|
2025-03-17 19:50:58 +01:00 |
|
vahidrezanezhad
|
aa72ca3006
|
Resolved an issue in the OCR-D framework where dir_out received a None value
|
2025-03-13 15:02:38 +01:00 |
|
vahidrezanezhad
|
a4f1f35125
|
Resolving test failure
|
2025-03-07 13:19:56 +01:00 |
|
kba
|
54040c1db4
|
Merge remote-tracking branch 'bertsky/machine_based_reading_order_integration_fixes' into machine_based_reading_order_integration
|
2025-03-06 15:48:52 +01:00 |
|
vahidrezanezhad
|
7110bd971f
|
resolved an error in the light version when slope_deskew is smaller than slope_threshold
|
2025-02-27 19:11:15 +01:00 |
|
vahidrezanezhad
|
25116a2c79
|
resolved 2 errors
|
2025-02-19 00:35:48 +01:00 |
|
vahidrezanezhad
|
33fda2f8be
|
changing cnn ocr model name
|
2024-12-26 22:45:40 +01:00 |
|
Robert Sachunsky
|
335aa273a1
|
simplify, wrap extremely long lines
|
2024-12-23 13:36:29 +00:00 |
|
Robert Sachunsky
|
cfc65128b1
|
reduce redundancy/indentation
|
2024-12-22 14:56:32 +00:00 |
|
Robert Sachunsky
|
01376af905
|
do_order_of_regions_with_model: simplify
|
2024-12-22 13:10:05 +00:00 |
|
vahidrezanezhad
|
92bfac4b41
|
Provide OCR as an option to process a directory of XML files, incorporating layout and text line coordinates.
|
2024-12-20 15:47:21 +01:00 |
|
vahidrezanezhad
|
fbeef79d50
|
adding scatter_nd inference
|
2024-12-16 01:11:54 +01:00 |
|
Robert Sachunsky
|
0ae28f7d3e
|
switch from stdlib to loky.ProcessPoolExecutor, ensure shutdown
|
2024-12-14 12:16:29 +00:00 |
|
vahidrezanezhad
|
f93c6c288d
|
added a function for patch-wise inference with scatter_nd
|
2024-12-14 02:50:17 +01:00 |
|
vahidrezanezhad
|
0e8c561618
|
debugging issues
|
2024-12-14 00:24:29 +01:00 |
|
Robert Sachunsky
|
e9c0d716f6
|
CI: install optional dependencies, too
|
2024-12-11 23:48:56 +00:00 |
|
Robert Sachunsky
|
dcaf796283
|
change polarity of orientation angle (PAGE schema required cw=positive)
|
2024-12-11 23:07:56 +00:00 |
|
Robert Sachunsky
|
b4b0890294
|
add option to overwrite output xml, but skip by default if file exists
|
2024-12-11 19:52:21 +00:00 |
|
Robert Sachunsky
|
b9ca7a6191
|
log num_cols-dependent resizing
|
2024-12-11 18:48:26 +00:00 |
|
Robert Sachunsky
|
9270ea4550
|
annotate region angles in PAGE
|
2024-12-11 18:48:26 +00:00 |
|
Robert Sachunsky
|
3b70b11ea6
|
avoid deskewing patches if binary-empty
|
2024-12-11 18:48:26 +00:00 |
|
Robert Sachunsky
|
7e9ee90e6e
|
switch from (ad-hoc) mp.Pool to (attribute) concurrent.futures.ProcessPoolExecutor
|
2024-12-11 18:48:26 +00:00 |
|
Robert Sachunsky
|
68456ea002
|
do_work_of_slopes_new*, do_back_rotation_and_get_cnt_back, do_work_of_contours_in_image: use mp.Pool, simplify
|
2024-12-11 18:48:26 +00:00 |
|
Robert Sachunsky
|
25e967397d
|
exit early if no text regions found (to avoid segfault)
|
2024-12-11 18:48:26 +00:00 |
|
Robert Sachunsky
|
21efea8711
|
no del on function argument
|
2024-12-11 18:48:26 +00:00 |
|
Robert Sachunsky
|
5e0c1da711
|
simplify
|
2024-12-11 00:18:58 +00:00 |
|
Robert Sachunsky
|
54cb15056b
|
do_image_rotation / return_deskew_slop: avoid code duplication, simplify via mp.Pool
|
2024-12-10 09:52:32 +00:00 |
|
Robert Sachunsky
|
6fe02df973
|
do_image_rotation: fix f93fa12 (do return results)
|
2024-12-09 16:35:31 +00:00 |
|
Robert Sachunsky
|
d68017037c
|
do_prediction: trigger GC to avoid CUDA OOM
|
2024-12-09 11:27:11 +00:00 |
|
Robert Sachunsky
|
ad748d0039
|
do_prediction: avoid code duplication
|
2024-12-09 10:55:41 +00:00 |
|
Robert Sachunsky
|
c3163caefd
|
avoid indentation
|
2024-12-05 14:28:17 +00:00 |
|
Robert Sachunsky
|
055463d23a
|
avoid indentation
|
2024-12-05 09:43:30 +00:00 |
|
Robert Sachunsky
|
aaea2ef463
|
simplify
|
2024-12-05 09:40:02 +00:00 |
|
Robert Sachunsky
|
3d88b207fc
|
run: log instead of print
|
2024-12-05 09:39:55 +00:00 |
|
Robert Sachunsky
|
a520bd1f77
|
wrap extremely long lines
|
2024-12-04 23:04:51 +00:00 |
|
Robert Sachunsky
|
cd4e426977
|
avoid indentation (skip_layout_and_reading_order)
|
2024-12-04 23:04:48 +00:00 |
|
Robert Sachunsky
|
5b82320707
|
avoid indentation
|
2024-12-04 22:09:32 +00:00 |
|
Robert Sachunsky
|
9f12fa241d
|
log-level: only set 'eynollah' logger level
|
2024-12-04 22:09:15 +00:00 |
|
Robert Sachunsky
|
14beb46224
|
simplify loading models w/o dir_in mode
|
2024-12-04 21:07:26 +00:00 |
|
Robert Sachunsky
|
329fac23f6
|
do not reload enhancement model in dir_in mode, simplify
|
2024-12-04 18:29:49 +00:00 |
|
Robert Sachunsky
|
3b9a29bc5c
|
simplify dir_in conditionals
|
2024-12-04 18:19:54 +00:00 |
|
Robert Sachunsky
|
7ae64f3717
|
RO model: do not reload when in dir_in mode
|
2024-12-04 16:18:35 +00:00 |
|
Robert Sachunsky
|
f765e2603b
|
move Torch to optional dependencies (to avoid clash with TF over CuDNN)
|
2024-12-04 15:57:13 +00:00 |
|
vahidrezanezhad
|
871d7bfc5a
|
fixed: machine-based reading order caused a "tuple index out of range" error when there is only one text region.
|
2024-12-04 16:41:00 +01:00 |
|
vahidrezanezhad
|
6aad006f4c
|
filter out text regions without text lines
|
2024-12-02 12:43:57 +01:00 |
|
kba
|
1083d1c7fb
|
gha: try to free disk space
|
2024-11-25 19:32:48 +01:00 |
|
vahidrezanezhad
|
8014a9e416
|
Update Makefile
|
2024-11-22 19:47:06 +01:00 |
|
vahidrezanezhad
|
3000255a24
|
Update Makefile
|
2024-11-22 12:40:21 +01:00 |
|
vahidrezanezhad
|
1746920275
|
Update Makefile
|
2024-11-21 12:08:29 +01:00 |
|
vahidrezanezhad
|
b622494f34
|
new table detection model is integrated
|
2024-11-21 02:16:22 +01:00 |
|