vahidrezanezhad
|
adcf03c7b7
|
enhancing ocr
|
2025-05-23 18:06:53 +02:00 |
|
vahidrezanezhad
|
d4f6e10251
|
commit 21ec4fb is cherry-picked + rnn ocr runs at the same time as segmentation + enhancement of mb reading order
|
2025-05-23 15:55:03 +02:00 |
|
vahidrezanezhad
|
a0647eff93
|
enhancing curved lines OCR
|
2025-05-21 17:42:44 +02:00 |
|
vahidrezanezhad
|
f94fc9973b
|
Implement hyphenated textline merging in OCR engine and a bug fixed for curved textline OCR
|
2025-05-21 14:39:31 +02:00 |
|
vahidrezanezhad
|
c0835665a9
|
ocr for curved lines
|
2025-05-20 19:01:52 +02:00 |
|
vahidrezanezhad
|
848156dd9d
|
mb reading order can now be done faster. Text regions are clustered using dilation, so mb reading order only needs to be computed for fewer regions
|
2025-05-20 16:51:08 +02:00 |
|
vahidrezanezhad
|
7a34bbb493
|
enhancing marginal detection for light version
|
2025-05-18 02:48:05 +02:00 |
|
vahidrezanezhad
|
0819730355
|
marginals detection enhanced for light version
|
2025-05-15 15:33:50 +02:00 |
|
vahidrezanezhad
|
adee1dc55c
|
enhancement for vertical textlines
|
2025-05-15 00:45:22 +02:00 |
|
vahidrezanezhad
|
a9cdd56e9a
|
enhance ocr for vertical textlines
|
2025-05-14 18:34:58 +02:00 |
|
vahidrezanezhad
|
1ccd3fb7cf
|
Accurately writing text line contours into xml file when the deskewing exceeds 45 degrees and the text line is in light mode
|
2025-05-13 15:53:05 +02:00 |
|
vahidrezanezhad
|
07f5b52fa7
|
The initial attempt at reading heavily deskewed or vertically aligned lines.
|
2025-05-13 14:40:57 +02:00 |
|
vahidrezanezhad
|
02a679a145
|
I have tried to address issues #163 and #161. The changes have also improved marginal detection and enhanced the isolation of headers.
|
2025-05-12 00:10:18 +02:00 |
|
vahidrezanezhad
|
5d447abcc4
|
allow adding a dataset abbreviation to extracted textline images and text
|
2025-05-03 02:59:16 +02:00 |
|
vahidrezanezhad
|
8c8fa461bb
|
machine based model name changed to public one
|
2025-05-02 12:57:26 +02:00 |
|
vahidrezanezhad
|
a4defbb04d
|
inference batch size for ocr is passed as an argument
|
2025-05-02 12:53:33 +02:00 |
|
vahidrezanezhad
|
fd375e15d5
|
adding a space between the predicted texts of split textlines in the case of trocr
|
2025-05-02 01:02:32 +02:00 |
|
vahidrezanezhad
|
5c8084a397
|
displaying detected text on an image is provided for the trocr case
|
2025-05-02 00:30:36 +02:00 |
|
vahidrezanezhad
|
e2da7a6239
|
Fix model name to return the correct machine-based model name
|
2025-04-30 16:06:29 +02:00 |
|
vahidrezanezhad
|
b227736094
|
Fix OCR text cleaning to correctly handle 'U', 'K', and 'N' at the start of a sentence; update text line splitting size
|
2025-04-30 16:04:34 +02:00 |
|
vahidrezanezhad
|
4cb4414740
|
Resolve remaining issue with #158 and resolve #124
|
2025-04-30 16:01:52 +02:00 |
|
vahidrezanezhad
|
208bde706f
|
resolving issue #158
|
2025-04-30 13:55:09 +02:00 |
|
Konstantin Baierer
|
3e8adb86c2
|
Merge pull request #157 from qurator-spk/kba-patch-1
CI: Use most recent actions/setup-python@v5
|
2025-04-29 11:42:18 +02:00 |
|
Konstantin Baierer
|
77dae129d5
|
CI: Use most recent actions/setup-python@v5
|
2025-04-22 13:22:28 +02:00 |
|
Clemens Neudecker
|
b4df978dd5
|
Merge pull request #154 from qurator-spk/ci-pypi
CI: pypi
|
2025-04-17 17:01:20 +02:00 |
|
kba
|
30ba234641
|
CI: pypi
|
2025-04-16 19:27:17 +02:00 |
|
kba
|
41318f0404
|
📝 changelog
|
2025-04-15 11:14:26 +02:00 |
|
vahidrezanezhad
|
a22df11ebb
|
Restoring the contour in the original image caused an error due to an empty tuple. This issue has been resolved, and as expected, the confidence score for this contour is set to zero
|
2025-04-14 00:42:08 +02:00 |
|
kba
|
8080bd823c
|
📦 v0.4.0
|
2025-04-07 16:48:57 +02:00 |
|
Robert Sachunsky
|
bcf1898aa4
|
📝 changelog
|
2025-04-07 16:46:58 +02:00 |
|
Robert Sachunsky
|
177e017167
|
test_run: ensure exceptions are shown
|
2025-04-07 10:39:50 +00:00 |
|
vahidrezanezhad
|
e2907f67e0
|
'from PIL.Image import Image' causes an error when using Image.new(), and since Image is already imported, this line can be safely commented out.
|
2025-04-06 00:33:36 +02:00 |
|
Robert Sachunsky
|
132d3e3d27
|
CI: use clash-free artifact name for report upload
|
2025-04-05 11:36:21 +02:00 |
|
Robert Sachunsky
|
dc64079b6b
|
CI: fix coverage report calls
|
2025-04-05 03:40:02 +02:00 |
|
Robert Sachunsky
|
7609c64c8b
|
CI: make coverage cfg work with both editable and dist install
|
2025-04-05 03:05:26 +02:00 |
|
Robert Sachunsky
|
bbc06dbbc1
|
CI: forgot to (re-)enable verbose logging
|
2025-04-05 02:10:52 +02:00 |
|
Robert Sachunsky
|
a41f18b13d
|
CI: (try to) store/upload coverage results
|
2025-04-05 01:34:28 +02:00 |
|
Robert Sachunsky
|
4339444e47
|
binarization CLI: fix option checks, simplify to asserts, fix dir_in mode
|
2025-04-05 01:21:08 +02:00 |
|
Robert Sachunsky
|
56cc179d35
|
pytest: add tests for directory mode (layout+bin)
|
2025-04-05 01:20:38 +02:00 |
|
Robert Sachunsky
|
a3e1b3d4d5
|
pytest: add asserts for results, add binarization
|
2025-04-04 23:37:00 +02:00 |
|
Robert Sachunsky
|
b03116f4a6
|
pytest: use subtests for various layout options, add coverage
|
2025-04-04 22:22:50 +02:00 |
|
Robert Sachunsky
|
91a340f619
|
CLI: simplify option checks to asserts (also avoid stack trace)
|
2025-04-04 20:42:28 +02:00 |
|
Robert Sachunsky
|
e0a7fde537
|
logger: fix type hint
|
2025-04-04 20:27:15 +02:00 |
|
Robert Sachunsky
|
108ce1f5a1
|
Merge remote-tracking branch 'origin/main' into v3-api-release-foreal
(bad-ass difficult diff diffing)
|
2025-04-04 20:23:23 +02:00 |
|
Konstantin Baierer
|
e0d38517d3
|
Merge pull request #130 from qurator-spk/v3-api
port processor to core v3
|
2025-04-04 16:01:45 +02:00 |
|
vahidrezanezhad
|
2e3a29f66b
|
In light mode: To determine whether a main region is a header, I adjusted the ratio to achieve better results.
|
2025-04-04 15:36:31 +02:00 |
|
Konstantin Baierer
|
85566c2186
|
Merge pull request #148 from bertsky/v3-api
fix, merge, resolve conflicts, apply review, migrate sbb-binarize
|
2025-04-04 13:31:00 +02:00 |
|
Robert Sachunsky
|
1a0b9d1958
|
Merge pull request #1 from bertsky/v3-api-refactor-init
refactoring of Eynollah init and model loading
|
2025-04-04 13:30:23 +02:00 |
|
vahidrezanezhad
|
38a2d60fa2
|
Confidence value for textregions is set to zero in the case of the non-light version. This is done to let the pipeline run through. It will be updated to return the correct value in upcoming commits
|
2025-04-03 12:47:27 +02:00 |
|
vahidrezanezhad
|
6b52da227c
|
decorating eynollah with textregion confidence score #135
|
2025-04-03 00:39:21 +02:00 |
|