kba
|
96eb1c11e6
|
Merge remote-tracking branch 'bertsky/loky-with-shm-for-175-rebuilt' into prepare-v0.6.0
|
2025-10-01 20:27:56 +02:00 |
|
vahidrezanezhad
|
5725e4fd1f
|
Continue processing when num_col is None but textregions exist; convert marginal-only to main body if no main body is present; reset deskew angle to 0 when text region density (textregion area to page area) < 0.3 and angle > 45°.
|
2025-10-01 15:58:03 +02:00 |
|
Robert Sachunsky
|
3aa7ad04fa
|
📝 update changelog
|
2025-09-30 23:14:52 +02:00 |
|
Robert Sachunsky
|
f0de1adabf
|
rm loky dependency
|
2025-09-30 23:12:18 +02:00 |
|
Robert Sachunsky
|
7daec392b9
|
Dockerfile: fix up CUDA installation for mixed TF/Torch
|
2025-09-30 22:10:45 +02:00 |
|
Robert Sachunsky
|
ad129ed46c
|
CI: remove OS from model cache keys
|
2025-09-30 22:05:53 +02:00 |
|
Robert Sachunsky
|
c86e59f481
|
CI: update model key, split up cache restore/save
|
2025-09-30 22:03:46 +02:00 |
|
Robert Sachunsky
|
a3d8197930
|
makefile: update model URL
|
2025-09-30 21:50:21 +02:00 |
|
Robert Sachunsky
|
61b20cc83d
|
tests: switch from subtests to parametrize, use --isolate everywhere to free CUDA memory in between
|
2025-09-30 19:20:35 +02:00 |
|
Robert Sachunsky
|
375e0263d4
|
CNN-RNN OCR model: switch to 20250930 version (compatible with TF 2.12 on CPU as well)
|
2025-09-30 19:16:50 +02:00 |
|
Robert Sachunsky
|
b21051db21
|
ProcessPoolExecutor: shutdown during del() instead of atexit()
|
2025-09-30 19:16:00 +02:00 |
|
Robert Sachunsky
|
08c8c26028
|
indent extremely long lines
|
2025-09-30 03:52:19 +02:00 |
|
Robert Sachunsky
|
f857ee7b51
|
simplify
|
2025-09-30 02:26:00 +02:00 |
|
Robert Sachunsky
|
c0137c29ad
|
try to fix the failed outsourcing of utils_ocr
|
2025-09-30 02:23:43 +02:00 |
|
Robert Sachunsky
|
13f85b0d5c
|
Merge branch 'main' into loky-with-shm-for-175-rebuilt
|
2025-09-30 02:07:20 +02:00 |
|
Robert Sachunsky
|
758602403e
|
replace loky with concurrent.futures.ProcessPoolExecutor (faster)
|
2025-09-29 17:48:22 +02:00 |
|
Robert Sachunsky
|
0366707136
|
get_smallest_skew: do not pass logger
|
2025-09-29 17:48:22 +02:00 |
|
Robert Sachunsky
|
b94c96fcbb
|
find_num_col: exit early if empty (avoiding exceptions)
|
2025-09-29 17:48:22 +02:00 |
|
Robert Sachunsky
|
04c3d7dd1b
|
get_smallest_skew: avoid shm if no ProcessPoolExecutor is passed
|
2025-09-29 17:48:22 +02:00 |
|
Robert Sachunsky
|
0662ece536
|
do_work_of_slopes*: use shm also in non-light mode(s)
|
2025-09-29 17:48:22 +02:00 |
|
Robert Sachunsky
|
31f240c3b8
|
do_image_rotation, do_work_of_slopes_new_curved: pass arrays via shared memory
|
2025-09-29 17:48:22 +02:00 |
|
Robert Sachunsky
|
8be2c79771
|
Revert "deskewing with faster multiprocessing"
This reverts commit 5db3e9fa64.
|
2025-09-29 17:48:22 +02:00 |
|
Robert Sachunsky
|
abf5c0f845
|
get_smallest_skew: when shifting search range of rotation angle, compare resulting (maximum) variances instead of blindly assuming the new range is better
|
2025-09-29 17:48:22 +02:00 |
|
Robert Sachunsky
|
dc0caad512
|
writer: use @type='heading' instead of 'header'
|
2025-09-29 17:48:22 +02:00 |
|
Robert Sachunsky
|
f458e3ece0
|
writer: SeparatorRegion needs SeparatorRegionType (not ImageRegionType)
|
2025-09-29 17:48:22 +02:00 |
|
Robert Sachunsky
|
4337d62985
|
contours: rename 'pixel' → 'label' for clarity
|
2025-09-29 17:48:22 +02:00 |
|
Robert Sachunsky
|
5b16c2fc00
|
avoid pulling unused 'image_page_rotated' through functions
|
2025-09-29 17:48:22 +02:00 |
|
Robert Sachunsky
|
5bff2d156a
|
use box2rect instead of crop_image_inside_box when no image needed
|
2025-09-29 17:48:22 +02:00 |
|
Robert Sachunsky
|
9b5182c1c0
|
utils: introduce box2rect and box2slice
|
2025-09-29 17:48:19 +02:00 |
|
Robert Sachunsky
|
bca2ae3d78
|
get_marginals: exit early if no peaks found to avoid spurious overlap mask
|
2025-09-29 17:47:51 +02:00 |
|
Robert Sachunsky
|
235539a350
|
filter_contours_without_textline_inside: avoid removing from identical lists twice
|
2025-09-29 17:47:51 +02:00 |
|
Robert Sachunsky
|
11e143afee
|
polygon2contour: avoid overflow
|
2025-09-29 17:47:51 +02:00 |
|
Robert Sachunsky
|
7a9e8256ee
|
increase dilatation: textregions/lines (5→6), seplines (0→1)
|
2025-09-29 17:47:51 +02:00 |
|
Robert Sachunsky
|
f3faa29528
|
refactor shapely conversions into contour2polygon / polygon2contour, also handle heterogeneous geometries
|
2025-09-29 17:47:51 +02:00 |
|
Robert Sachunsky
|
0650274ffa
|
move dilate_*_contours to .utils.contour, rename dilate_textregions_contours_textline_version → dilate_textline_contours
|
2025-09-29 17:47:47 +02:00 |
|
Robert Sachunsky
|
a433c73628
|
filter_contours_area_of_image*: also ensure validity here
|
2025-09-29 17:46:50 +02:00 |
|
Robert Sachunsky
|
17bcf1af71
|
rename *lines_xml → *seplines for clarity
|
2025-09-29 17:46:50 +02:00 |
|
Robert Sachunsky
|
e730725da3
|
check_any_text_region_in_model_one_is_main_or_header_light: return original instead of resampled contours
|
2025-09-29 17:46:50 +02:00 |
|
Robert Sachunsky
|
7b51fd6624
|
avoid creating invalid polygons via rounding
|
2025-09-29 17:46:50 +02:00 |
|
Robert Sachunsky
|
41cc38c51a
|
get_textregion_contours_in_org_image_light: no back rotation, drop slope_first (always 0)
|
2025-09-29 17:46:48 +02:00 |
|
Robert Sachunsky
|
afba70c920
|
separate_lines/do_work_of_slopes: skip if crop is empty
|
2025-09-29 17:44:39 +02:00 |
|
Robert Sachunsky
|
66b2bce8b9
|
return_boxes_of_images_by_order_of_reading_new: log any exceptions
|
2025-09-29 17:44:36 +02:00 |
|
Robert Sachunsky
|
b48c41e68f
|
return_boxes_of_images_by_order_of_reading_new: simplify, avoid changing dtype during np.append
|
2025-09-29 17:42:53 +02:00 |
|
Robert Sachunsky
|
09ece86f0d
|
dilate_textregions_contours: simplify (via shapely's Polygon.buffer()), ensure validity
|
2025-09-29 17:42:53 +02:00 |
|
Konstantin Baierer
|
a6f0af07d1
|
Merge pull request #185 from bertsky/patch-4
CD: master is now main
|
2025-09-29 10:44:27 +02:00 |
|
Robert Sachunsky
|
92c1e824dc
|
CD: master is now main
|
2025-09-26 23:05:47 +02:00 |
|
kba
|
6ea6a62801
|
📝 v0.5.0
|
2025-09-26 16:23:46 +02:00 |
|
Konstantin Baierer
|
882e242946
|
Merge pull request #178 from qurator-spk/prepare-release-v0.5.0
Prepare release v0.5.0
|
2025-09-26 16:21:09 +02:00 |
|
kba
|
37e64b4e45
|
📝 changelog
|
2025-09-26 16:19:04 +02:00 |
|
kba
|
3123add815
|
📝 update README
|
2025-09-26 15:07:32 +02:00 |
|