vahidrezanezhad
|
53dd4b26a9
|
decorated with confidence value for cnnrnn ocr model
|
2025-07-03 11:50:47 +02:00 |
|
vahidrezanezhad
|
f79af201ab
|
Fix: Resolved OCR bug when text region type is undefined
|
2025-06-02 18:21:33 +02:00 |
|
vahidrezanezhad
|
e26c4ab9b4
|
image enhancer updated
|
2025-06-01 22:44:50 +02:00 |
|
vahidrezanezhad
|
9342b76038
|
saving enhanced image in org or scaled resolution
|
2025-06-01 22:10:13 +02:00 |
|
vahidrezanezhad
|
3b475915c7
|
image enhancer is integrated
|
2025-06-01 15:53:04 +02:00 |
|
vahidrezanezhad
|
df903aa1b4
|
Parametrize OCR for handling curved lines
|
2025-05-31 01:09:14 +02:00 |
|
vahidrezanezhad
|
1e7cecfcf9
|
updating ocr
|
2025-05-28 01:17:21 +02:00 |
|
vahidrezanezhad
|
03f52e7a46
|
updating ocr
|
2025-05-27 23:45:22 +02:00 |
|
vahidrezanezhad
|
31d9fa0c80
|
strings alignment function is added + new changes needed for prediction with both bin and rgb inputs is implemented
|
2025-05-25 21:44:36 +02:00 |
|
vahidrezanezhad
|
b18691f96a
|
rnn ocr for all layout textregion types
|
2025-05-25 03:33:54 +02:00 |
|
vahidrezanezhad
|
ba3420b2d8
|
Drop capitals are written separately and are not attached to their corresponding text line. The OCR use case also supports single-image input.
|
2025-05-25 01:12:58 +02:00 |
|
vahidrezanezhad
|
0250a6d3d0
|
enhancing ocr
|
2025-05-23 18:06:53 +02:00 |
|
vahidrezanezhad
|
089029cec7
|
commit 21ec4fb is picked + rnn ocr at the same time with segmentation + enhancement of mb reading order
|
2025-05-23 15:55:03 +02:00 |
|
vahidrezanezhad
|
ee2c7e9013
|
enhancing curved lines OCR
|
2025-05-21 17:42:44 +02:00 |
|
vahidrezanezhad
|
14b70c2556
|
Implement hyphenated textline merging in OCR engine and a bug fixed for curved textline OCR
|
2025-05-21 14:39:31 +02:00 |
|
vahidrezanezhad
|
3ad621e956
|
ocr for curved lines
|
2025-05-20 19:01:52 +02:00 |
|
vahidrezanezhad
|
44ff51f5c1
|
mb reading order can now be done faster. Text regions are clustered using dilation, so mb reading order only has to be run on fewer regions
|
2025-05-20 16:51:08 +02:00 |
|
vahidrezanezhad
|
5016039cd7
|
enhancing marginal detection for light version
|
2025-05-18 02:48:05 +02:00 |
|
vahidrezanezhad
|
1cbc669d36
|
marginals detection enhanced for light version
|
2025-05-15 15:33:50 +02:00 |
|
vahidrezanezhad
|
1b229ba7ae
|
enhancement for vertical textlines
|
2025-05-15 00:45:22 +02:00 |
|
vahidrezanezhad
|
ed46615f00
|
enhance ocr for vertical textlines
|
2025-05-14 18:34:58 +02:00 |
|
vahidrezanezhad
|
88e0315321
|
Accurately writing text line contours into xml file when the deskewing exceeds 45 degrees and the text line is in light mode
|
2025-05-13 15:53:05 +02:00 |
|
vahidrezanezhad
|
54088c6b04
|
The initial attempt at reading heavily deskewed or vertically aligned lines.
|
2025-05-13 14:40:57 +02:00 |
|
vahidrezanezhad
|
c12b09a868
|
I have tried to address issues #163 and #161. The changes have also improved marginal detection and enhanced the isolation of headers.
|
2025-05-12 00:10:18 +02:00 |
|
vahidrezanezhad
|
89aa545049
|
allow adding a dataset abbreviation to extracted textline images and text
|
2025-05-03 02:59:16 +02:00 |
|
vahidrezanezhad
|
48e8dd4ab3
|
machine based model name changed to public one
|
2025-05-02 12:57:26 +02:00 |
|
vahidrezanezhad
|
a1a004b19d
|
inference batch size for ocr is passed as an argument
|
2025-05-02 12:53:33 +02:00 |
|
vahidrezanezhad
|
5d8c864c08
|
adding space between split textline predicted text in the case of trocr
|
2025-05-02 01:02:32 +02:00 |
|
vahidrezanezhad
|
184af46664
|
displaying detected text on an image is provided for trocr case
|
2025-05-02 00:30:36 +02:00 |
|
Robert Sachunsky
|
21615a986d
|
OCR-D processor: expose reading_order_machine_based
|
2025-08-13 14:14:37 +02:00 |
|
kba
|
b7b218ff11
|
OCR-D processor: same behavior as standalone wrt light_version/textline_light
|
2025-06-12 15:30:17 +02:00 |
|
vahidrezanezhad
|
c194a20c9c
|
Fixed duplicate textline_light assignments (true and false) in the OCR-D framework for the Eynollah light version, which caused rectangles to be used instead of contours for textlines
|
2025-06-12 15:27:22 +02:00 |
|
vahidrezanezhad
|
e2da7a6239
|
Fix model name to return the correct machine-based model name
|
2025-04-30 16:06:29 +02:00 |
|
vahidrezanezhad
|
b227736094
|
Fix OCR text cleaning to correctly handle 'U', 'K', and 'N' starting sentence; update text line splitting size
|
2025-04-30 16:04:34 +02:00 |
|
vahidrezanezhad
|
4cb4414740
|
Resolve remaining issue with #158 and resolve #124
|
2025-04-30 16:01:52 +02:00 |
|
vahidrezanezhad
|
208bde706f
|
resolving issue #158
|
2025-04-30 13:55:09 +02:00 |
|
vahidrezanezhad
|
a22df11ebb
|
Restoring the contour in the original image caused an error due to an empty tuple. This issue has been resolved, and as expected, the confidence score for this contour is set to zero
|
2025-04-14 00:42:08 +02:00 |
|
kba
|
8080bd823c
|
📦 v0.4.0
|
2025-04-07 16:48:57 +02:00 |
|
vahidrezanezhad
|
e2907f67e0
|
'from PIL.Image import Image' causes an error when using Image.new(), and since Image is already imported, this line can be safely commented out.
|
2025-04-06 00:33:36 +02:00 |
|
Robert Sachunsky
|
4339444e47
|
binarization CLI: fix option checks, simplify to asserts, fix dir_in mode
|
2025-04-05 01:21:08 +02:00 |
|
Robert Sachunsky
|
91a340f619
|
CLI: simplify option checks to asserts (also avoid stack trace)
|
2025-04-04 20:42:28 +02:00 |
|
Robert Sachunsky
|
e0a7fde537
|
logger: fix type hint
|
2025-04-04 20:27:15 +02:00 |
|
Robert Sachunsky
|
108ce1f5a1
|
Merge remote-tracking branch 'origin/main' into v3-api-release-foreal
(bad-ass difficult diff diffing)
|
2025-04-04 20:23:23 +02:00 |
|
vahidrezanezhad
|
2e3a29f66b
|
In light mode: To determine whether a main region is a header, I adjusted the ratio to achieve better results.
|
2025-04-04 15:36:31 +02:00 |
|
vahidrezanezhad
|
38a2d60fa2
|
Confidence value for textregions in the non-light version is set to zero. This is done to let the pipeline go through. It will be updated to return the correct value in upcoming commits
|
2025-04-03 12:47:27 +02:00 |
|
vahidrezanezhad
|
6b52da227c
|
decorating eynollah with textregion confidence score #135
|
2025-04-03 00:39:21 +02:00 |
|
Robert Sachunsky
|
559d001eef
|
another fix to avoid frequent warnings
|
2025-04-02 05:45:34 +00:00 |
|
Robert Sachunsky
|
dd478279a4
|
CLI: also --overwrite in single-image mode
|
2025-04-02 05:40:21 +00:00 |
|
Robert Sachunsky
|
8159e6336a
|
fix typo (preventing log messages)
|
2025-04-02 00:01:02 +00:00 |
|
Robert Sachunsky
|
2919538382
|
minor fixes to avoid frequent warnings
|
2025-04-01 23:33:26 +00:00 |
|