eynollah

qurator-spk/eynollah

mirror of https://github.com/qurator-spk/eynollah.git synced 2025-08-01 14:20:00 +02:00

Commit graph

1a1170ab5d Merge 2996fc8b30 into 6b8893b188 vahidrezanezhad 2025-07-24 13:31:01 +00:00
2996fc8b30 Merge pull request #166 from qurator-spk/updating_readme_for_eynollah_use_cases-cli updating_readme_for_eynollah_use_cases Clemens Neudecker 2025-07-24 15:30:57 +02:00
fd0595f920 Update Makefile updating_readme_for_eynollah_use_cases-cli vahidrezanezhad 2025-07-24 13:52:38 +02:00
611a521045 Merge 3b1886973d into 6b8893b188 Konstantin Baierer 2025-07-24 14:19:10 +10:00
da141bb42e resolving tests error vahidrezanezhad 2025-07-23 16:44:17 +02:00
83d77b4aab Merge 42a3cc2335 into 6b8893b188 Robert Sachunsky 2025-07-22 14:51:52 +02:00
6b8893b188 Merge pull request #167 from qurator-spk/ocrd-fixes main vahidrezanezhad 2025-07-22 14:46:25 +02:00
42a3cc2335 cv2pil: limit color depth on output imgs Robert Sachunsky 2025-06-13 01:21:25 +02:00
dadb879376 pil2cv: allow (and drop) alpha channels on input imgs Robert Sachunsky 2025-06-12 20:55:37 +02:00
b7b218ff11 OCR-D processor: same behavior as standalone wrt light_version/textline_light ocrd-fixes kba 2025-06-12 15:30:17 +02:00
c194a20c9c Fixed duplicate textline_light assignments (true and false) in the OCR-D framework for the Eynollah light version, which caused rectangles to be used instead of contours for textlines vahidrezanezhad 2025-06-11 18:57:08 +02:00
32889ef1e0 adapt binarization CLI according to #156 kba 2025-06-12 13:57:41 +02:00
9b4e78c55c Fixed duplicate textline_light assignments (true and false) in the OCR-D framework for the Eynollah light version, which caused rectangles to be used instead of contours for textlines vahidrezanezhad 2025-06-11 18:57:08 +02:00
7a22e51f5d resolve some comments from review cneud 2025-05-14 21:56:03 +02:00
21ec4fbfb5 The text region coordinates are now correctly written into the XML output when using the skip layout and reading order option vahidrezanezhad 2025-05-07 14:04:01 +02:00
83211ae684 In the case of skip_layout_and_reading_order, the confidence value was not set correctly, leading to an error while writing to the XML file. vahidrezanezhad 2025-05-07 12:33:03 +02:00
3dcbb20cac Merge pull request #159 from bertsky/main Clemens Neudecker 2025-05-06 15:14:06 +02:00
e9179e1d34 docker: use latest core base stage Robert Sachunsky 2025-05-02 00:13:06 +02:00
f8b4d29a59 docker: prepackage ocrd-all-module-dir.json Robert Sachunsky 2025-05-02 00:13:11 +02:00
e2da7a6239 Fix model name to return the correct machine-based model name vahidrezanezhad 2025-04-30 16:06:29 +02:00
b227736094 Fix OCR text cleaning to correctly handle 'U', 'K', and 'N' starting sentence; update text line splitting size vahidrezanezhad 2025-04-30 16:04:34 +02:00
4cb4414740 Resolve remaining issue with #158 and resolving #124 vahidrezanezhad 2025-04-30 16:01:52 +02:00
208bde706f resolving issue #158 vahidrezanezhad 2025-04-30 13:55:09 +02:00
3e8adb86c2 Merge pull request #157 from qurator-spk/kba-patch-1 Konstantin Baierer 2025-04-29 11:42:18 +02:00
77dae129d5 CI: Use most recent actions/setup-python@v5 Konstantin Baierer 2025-04-22 13:22:28 +02:00
192b9111e3 updating eynollah README, how to use it for use cases vahidrezanezhad 2025-04-22 00:23:01 +02:00
b4df978dd5 Merge pull request #154 from qurator-spk/ci-pypi Clemens Neudecker 2025-04-17 17:01:20 +02:00
30ba234641 CI: pypi kba 2025-04-16 19:27:17 +02:00
41318f0404 📝 changelog kba 2025-04-15 11:14:26 +02:00
a22df11ebb Restoring the contour in the original image caused an error due to an empty tuple. This issue has been resolved, and as expected, the confidence score for this contour is set to zero vahidrezanezhad 2025-04-14 00:42:08 +02:00
3b1886973d gha snippets from codecov sample page test-codecov kba 2025-04-08 14:52:38 +02:00
60a05711bb test kba 2025-04-08 14:46:22 +02:00
8080bd823c 📦 v0.4.0 v0.4.0 kba 2025-04-07 16:48:57 +02:00
bcf1898aa4 📝 changelog Robert Sachunsky 2025-04-07 16:46:58 +02:00
177e017167 test_run: ensure exceptions are shown Robert Sachunsky 2025-04-06 18:24:56 +00:00
e2907f67e0 'from PIL.Image import Image' causes an error when using Image.new(), and since Image is already imported, this line can be safely commented out. vahidrezanezhad 2025-04-06 00:33:36 +02:00
132d3e3d27 CI: use clash-free artifact name for report upload Robert Sachunsky 2025-04-05 11:36:21 +02:00
dc64079b6b CI: fix coverage report calls Robert Sachunsky 2025-04-05 03:40:02 +02:00
7609c64c8b CI: make coverage cfg work with both editable and dist install Robert Sachunsky 2025-04-05 03:05:26 +02:00
bbc06dbbc1 CI: forgot to (re-)enable verbose logging Robert Sachunsky 2025-04-05 02:10:52 +02:00
a41f18b13d CI: (try to) store/upload coverage results Robert Sachunsky 2025-04-05 01:34:28 +02:00
4339444e47 binarization CLI: fix option checks, simplify to asserts, fix dir_in mode Robert Sachunsky 2025-04-05 01:21:08 +02:00
56cc179d35 pytest: add tests for directory mode (layout+bin) Robert Sachunsky 2025-04-04 23:48:30 +02:00
a3e1b3d4d5 pytest: add asserts for results, add binarization Robert Sachunsky 2025-04-04 23:37:00 +02:00
b03116f4a6 pytest: use subtests for various layout options, add coverage Robert Sachunsky 2025-04-04 22:22:50 +02:00
91a340f619 CLI: simplify option checks to asserts (also avoid stack trace) Robert Sachunsky 2025-04-04 20:42:28 +02:00
e0a7fde537 logger: fix type hint Robert Sachunsky 2025-04-04 20:27:15 +02:00
108ce1f5a1 Merge remote-tracking branch 'origin/main' into v3-api-release-foreal Robert Sachunsky 2025-04-04 20:23:23 +02:00
5c45cb4aee Merge remote-tracking branch 'origin/main' into v3-api-release kba 2025-04-04 17:20:49 +02:00
e0d38517d3 Merge pull request #130 from qurator-spk/v3-api Konstantin Baierer 2025-04-04 16:01:45 +02:00
2e3a29f66b In light mode: To determine whether a main region is a header, I adjusted the ratio to achieve better results. vahidrezanezhad 2025-04-04 15:36:31 +02:00
85566c2186 Merge pull request #148 from bertsky/v3-api Konstantin Baierer 2025-04-04 13:31:00 +02:00
1a0b9d1958 Merge pull request #1 from bertsky/v3-api-refactor-init Robert Sachunsky 2025-04-04 13:30:23 +02:00
e9a4324b8f Merge branch 'v3-api-refactor-init' into bertsky-v3-api kba 2025-04-04 13:14:12 +02:00
38a2d60fa2 Confidence value for textregions and in the case of not light version is set to zero. This is done to let the pipeline go through. It will be updated to return the correct value in upcoming commits vahidrezanezhad 2025-04-03 12:47:27 +02:00
6b52da227c decorating eynollah with textregion confidence score #135 vahidrezanezhad 2025-04-03 00:39:21 +02:00
559d001eef another fix to avoid frequent warnings Robert Sachunsky 2025-04-02 05:45:34 +00:00
dd478279a4 CLI: also --overwrite in single-image mode Robert Sachunsky 2025-04-02 05:40:21 +00:00
8159e6336a fix typo (preventing log messages) Robert Sachunsky 2025-04-02 00:01:02 +00:00
2919538382 minor fixes to avoid frequent warnings Robert Sachunsky 2025-04-01 23:33:26 +00:00
903c87aca0 update readme (OCR-D section) Robert Sachunsky 2025-04-01 23:26:38 +02:00
dcf2ed5e22 run: also write out XML in single filename mode Robert Sachunsky 2025-04-01 23:13:24 +02:00
fe77171d45 run_single: reduce indentation Robert Sachunsky 2025-04-01 22:47:33 +02:00
c7dc952851 smoke-test: also test dir-in mode and overwrite Robert Sachunsky 2025-04-01 22:43:30 +02:00
79003a083c CLI: ValueError instead of print+exit Robert Sachunsky 2025-04-01 22:43:01 +02:00
e17d34fafa factor run_single() out of run(), simplify kwargs Robert Sachunsky 2025-04-01 22:12:24 +02:00
1a0a1cb00b remove session methods and redundant model loaders Robert Sachunsky 2025-04-01 21:15:41 +02:00
ab3da17547 Update requirements.txt Robert Sachunsky 2025-04-01 18:13:28 +02:00
dd51f900b9 OCR-D: init Eynollah in 'setup', re-use instance for each page via non-public API Robert Sachunsky 2025-04-01 13:02:30 +02:00
ffeb4a343d Eynollah: remove useless 'pcgts' attr Robert Sachunsky 2025-04-01 13:00:41 +02:00
9dc33db108 CI: add binarization models to cache Robert Sachunsky 2025-04-01 11:36:56 +02:00
9c769d4cc5 CI: run CLI tests, too Robert Sachunsky 2025-04-01 11:13:16 +02:00
250fc02606 add tests for binarization, remove dependency on deps-test Robert Sachunsky 2025-04-01 11:13:04 +02:00
91b2201b07 cnnrnn Ocr: width of input textline image can not be zero! vahidrezanezhad 2025-04-01 10:55:40 +02:00
515b4023f6 sbb_binarize: fix missing reference Robert Sachunsky 2025-04-01 10:54:36 +02:00
95a681aa8c add Continuous Deployment via Dockerhub and GHCR Robert Sachunsky 2025-04-01 01:27:10 +02:00
df3510750c Github Actions CI: no more Docker clean or build Robert Sachunsky 2025-04-01 00:28:16 +02:00
45e3ab9692 Github Actions: free space: all existing Docker images Robert Sachunsky 2025-04-01 00:23:53 +02:00
4de441eaaa OCR prediction is now enabled to integrate results from both RGB and binarized images or to be performed on each individually vahidrezanezhad 2025-03-31 21:28:05 +02:00
b1da0a3327 In OCR, the predicted text is now drawn on the image, and the results are saved in a specified directory. This makes it easier to review the predicted output vahidrezanezhad 2025-03-31 18:43:14 +02:00
31aeb9629d Github Actions: free space more aggressively Robert Sachunsky 2025-03-31 18:16:17 +02:00
7430b57b65 dockerfile: add smoke test Robert Sachunsky 2025-03-31 16:56:47 +02:00
f35f49376e run CLI test in TMPDIR, add ocrd-test Robert Sachunsky 2025-03-31 16:55:57 +02:00
ae066388ea docker: no need for g++, but install w/ 'EXTRAS=OCR' Robert Sachunsky 2025-03-31 15:58:57 +02:00
722b5c6bf1 add make variable EXTRAS for optional dependencies Robert Sachunsky 2025-03-31 15:58:12 +02:00
c01609ff4e allow even more empty imports for optional dependencies Robert Sachunsky 2025-03-31 15:57:22 +02:00
51e9bfd6d7 improve+extend dockerfile Robert Sachunsky 2025-03-31 14:14:08 +02:00
09248d4829 improve+extend makefile Robert Sachunsky 2025-03-31 14:13:16 +02:00
46618f4229 allow more empty imports for optional dependencies Robert Sachunsky 2025-03-31 14:11:50 +02:00
4be89910a2 CLI: fix arg vs kwarg from merge Robert Sachunsky 2025-03-31 02:38:24 +02:00
9d61acf173 simplify Robert Sachunsky 2025-03-31 02:02:30 +02:00
a1068ff2eb OCR-D: move sbb-binarize to ocrd-tool.json, update to v3 Robert Sachunsky 2025-03-31 01:47:32 +02:00
c794d4d29f OCR-D: fix typo light_mode→light_version Robert Sachunsky 2025-03-31 01:46:29 +02:00
4338259ca1 OCR-D: ensure page image gets replaced in result as well if not the original file Robert Sachunsky 2025-03-31 01:17:14 +02:00
55969b0173 OCR-D: add docstring Robert Sachunsky 2025-03-31 01:15:26 +02:00
3916474b8b OCR-D: require >=v3.1 Robert Sachunsky 2025-03-31 01:15:12 +02:00
6d02e90570 OCR-D: restrict max_workers=1 Robert Sachunsky 2025-03-31 01:14:54 +02:00
efd3fa6775 allow empty imports for optional dependencies Robert Sachunsky 2025-03-31 00:32:26 +02:00
238132e260 use 'image_filename' for pseudo-iteration outside 'dir_in' mode Robert Sachunsky 2025-03-31 00:31:49 +02:00
af4e2a4ffc do not require 'dir_out' outside 'dir_in' mode Robert Sachunsky 2025-03-31 00:31:09 +02:00