vahidrezanezhad
|
6904a98182
|
get textlines inside textregion sorted debugging
|
2025-09-24 17:17:12 +02:00 |
|
vahidrezanezhad
|
ce13d8c5a3
|
get textlines inside textregion sorted
|
2025-09-24 17:16:47 +02:00 |
|
Konstantin Baierer
|
9c129c7f54
|
Merge pull request #180 from bertsky/prepare-release-v0.5.0-fixlogging
prepare release v0.5.0: fix logging
|
2025-09-24 12:28:10 +02:00 |
|
Robert Sachunsky
|
5bd318e657
|
rm print statement (already log msg)
|
2025-09-24 12:14:32 +02:00 |
|
Robert Sachunsky
|
90f1d7aa47
|
rm summary msg (info already logged elsewhere)
|
2025-09-24 12:10:11 +02:00 |
|
Robert Sachunsky
|
7933b103f5
|
log modes only once (in run, not in run_single)
|
2025-09-24 12:09:30 +02:00 |
|
Robert Sachunsky
|
d0817f5744
|
fix typo
|
2025-09-24 12:08:50 +02:00 |
|
kba
|
9ead58b99a
|
Merge remote-tracking branch 'michalbubula/add-feedback' into prepare-release-v0.5.0
|
2025-09-23 19:50:27 +02:00 |
|
kba
|
7bde99e866
|
Merge remote-tracking branch 'origin/updating_readme_for_eynollah_use_cases' into prepare-release-v0.5.0
|
2025-09-23 19:42:55 +02:00 |
|
kba
|
df8d93dbfa
|
Merge branch 'main' into add-feedback
|
2025-09-23 19:20:20 +02:00 |
|
vahidrezanezhad
|
554f3988c9
|
default cnn-rnn and transformer ocr models have changed to model_eynollah_ocr_cnnrnn_20250904 and model_eynollah_ocr_trocr_20250919 respectively
|
2025-09-21 16:33:14 +02:00 |
|
vahidrezanezhad
|
6bbdfe1074
|
extending image types
|
2025-09-21 02:32:40 +02:00 |
|
vahidrezanezhad
|
e97e3ab192
|
Merge text of textlines and handle hyphenated words by joining them correctly
|
2025-09-19 23:23:30 +02:00 |
|
vahidrezanezhad
|
b38331b4ab
|
writing page contour correctly in xml output + ignore unsupported file types when loading images
|
2025-09-19 18:06:18 +02:00 |
|
vahidrezanezhad
|
994bc8a1c0
|
debug new page extraction in the case of ignoring page extraction
|
2025-09-19 15:24:34 +02:00 |
|
kba
|
5c9cf8472b
|
remove redundant/brittle interval logging
|
2025-09-18 13:19:57 +02:00 |
|
kba
|
146102842a
|
convert all print stmts to logger.info calls
|
2025-09-18 13:15:18 +02:00 |
|
kba
|
c64d102613
|
move logging to CLI and make initialization optional
|
2025-09-18 13:07:41 +02:00 |
|
vahidrezanezhad
|
310679eeb8
|
page extraction model name is changed
|
2025-09-16 14:27:15 +02:00 |
|
vahidrezanezhad
|
542646791d
|
For TrOCR, cropped text lines are no longer collected in a list before prediction. Instead, the text line images are gathered per batch and predictions are made directly on them.
|
2025-09-23 19:03:13 +02:00 |
|
vahidrezanezhad
|
0711166524
|
changed the drop capitals bounding box to contour ratio threshold
|
2025-09-01 11:37:22 +02:00 |
|
vahidrezanezhad
|
e15640aa8a
|
new page extraction model integration
|
2025-09-15 13:36:58 +02:00 |
|
vahidrezanezhad
|
6a735daa60
|
Update README.md
|
2025-08-31 23:30:54 +02:00 |
|
vahidrezanezhad
|
9b9d21d8ac
|
eynollah ocr: support using either a specific model name or a models directory (default model)
|
2025-08-28 11:30:59 +02:00 |
|
vahidrezanezhad
|
41365645ef
|
Marginals are divided into left and right, and written from top to bottom.
|
2025-08-26 22:38:03 +02:00 |
|
vahidrezanezhad
|
7741502876
|
reading order on given layout
|
2025-08-18 02:31:13 +02:00 |
|
Clemens Neudecker
|
a2359ea4c4
|
Merge pull request #171 from bertsky/ocrd-machine-based-ro
OCR-D processor: expose reading_order_machine_based
|
2025-08-15 18:40:13 +02:00 |
|
Robert Sachunsky
|
21615a986d
|
OCR-D processor: expose reading_order_machine_based
|
2025-08-13 14:14:37 +02:00 |
|
michalbubula
|
8ebba5ac04
|
add feedback to command line interface
|
2025-08-12 16:21:15 +02:00 |
|
vahidrezanezhad
|
268aa141d7
|
avoiding float in range
|
2025-08-12 12:50:15 +02:00 |
|
vahidrezanezhad
|
52d9cc9baf
|
deskewing with faster multiprocessing
|
2025-08-08 11:32:02 +02:00 |
|
vahidrezanezhad
|
322b04145f
|
use the latest ocr model with balanced fraktur-antiqua training dataset
|
2025-08-05 14:22:22 +02:00 |
|
vahidrezanezhad
|
1b95f8f38d
|
threshold for textline ocr + new ocr model
|
2025-07-25 13:18:38 +02:00 |
|
Clemens Neudecker
|
2996fc8b30
|
Merge pull request #166 from qurator-spk/updating_readme_for_eynollah_use_cases-cli
Updating readme for eynollah use cases cli
|
2025-07-24 15:30:57 +02:00 |
|
vahidrezanezhad
|
fd0595f920
|
Update Makefile
|
2025-07-24 13:52:38 +02:00 |
|
vahidrezanezhad
|
da141bb42e
|
resolving tests error
|
2025-07-23 16:44:17 +02:00 |
|
vahidrezanezhad
|
6b8893b188
|
Merge pull request #167 from qurator-spk/ocrd-fixes
Ocrd fixes
|
2025-07-22 14:46:25 +02:00 |
|
vahidrezanezhad
|
daa597dbaa
|
should merged text for the whole page be written in xml?
|
2025-07-21 14:50:05 +02:00 |
|
vahidrezanezhad
|
673e67a847
|
update model names
|
2025-07-21 10:54:20 +02:00 |
|
vahidrezanezhad
|
fee40049cd
|
ocr model renamed - image text font for ocr result is now using Charis-7.000 font (downloaded from here https://software.sil.org/charis/download/)
|
2025-07-16 14:00:12 +02:00 |
|
vahidrezanezhad
|
04fead348f
|
ocr: make sure that image height or width is not zero
|
2025-07-03 15:24:52 +02:00 |
|
vahidrezanezhad
|
53dd4b26a9
|
decorated with confidence value for cnnrnn ocr model
|
2025-07-03 11:50:47 +02:00 |
|
kba
|
b7b218ff11
|
OCR-D processor: same behavior as standalone wrt light_version/textline_light
|
2025-06-12 15:30:17 +02:00 |
|
vahidrezanezhad
|
c194a20c9c
|
Fixed duplicate textline_light assignments (true and false) in the OCR-D framework for the Eynollah light version, which caused rectangles to be used instead of contours for textlines
|
2025-06-12 15:27:22 +02:00 |
|
kba
|
32889ef1e0
|
adapt binarization CLI according to #156
|
2025-06-12 13:57:41 +02:00 |
|
vahidrezanezhad
|
9b4e78c55c
|
Fixed duplicate textline_light assignments (true and false) in the OCR-D framework for the Eynollah light version, which caused rectangles to be used instead of contours for textlines
|
2025-06-11 18:57:08 +02:00 |
|
vahidrezanezhad
|
f79af201ab
|
Fix: Resolved OCR bug when text region type is undefined
|
2025-06-02 18:21:33 +02:00 |
|
vahidrezanezhad
|
e26c4ab9b4
|
image enhancer updated
|
2025-06-01 22:44:50 +02:00 |
|
vahidrezanezhad
|
9342b76038
|
saving enhanced image in original or scaled resolution
|
2025-06-01 22:10:13 +02:00 |
|
vahidrezanezhad
|
3b475915c7
|
image enhancer is integrated
|
2025-06-01 15:53:04 +02:00 |
|