| 5ece7f1b0a | 🧹 Remove remnants of ocrd-ocropy-segment | 2020-02-07 14:01:53 +01:00 |
| 135489eaeb | 🧹 Remove page_downgrade_to_2018 | 2020-02-07 13:59:55 +01:00 |
| 423d9c2ed6 | 🚧 do_validate: Skip dimension checking | 2020-02-07 13:59:19 +01:00 |
| 948e9074df | ⬆ Update to ocrd_calamari 0.0.4 | 2020-02-07 13:31:26 +01:00 |
| 1ef850992c | 🎨 Use same style of specifying parameters for all processors | 2020-02-07 13:20:18 +01:00 |
| b468d688f2 | 🧹 Remove font identification for now | 2020-02-07 12:30:39 +01:00 |
| 07555e8270 | 🎨 Use new OCR-D JSON string parameters | 2020-02-07 12:24:51 +01:00 |
| 9c31d604e9 | ⬆ Update ocrd-sbb-textline-detector command | 2020-01-16 16:34:03 +01:00 |
| fd56731464 | 🚧 Do not check PAGE coordinates for now | 2020-01-16 16:33:36 +01:00 |
| 87a2bce93c | ⬆ Update calamari-models URL + path | 2020-01-16 15:46:43 +01:00 |
| d166077a55 | ✨ Update to sbb_textline_detector with the fixed AlternativeImage support (= merged PAGE results) | 2019-11-20 12:40:05 +01:00 |
| de47a3e5b1 | 🔥 Remove now unused page_fix_image_references() | 2019-11-20 12:39:02 +01:00 |
| 1af18c629e | 🧹 Validate imagefilename again | 2019-10-30 11:25:34 +01:00 |
| de49aa715b | ⬆ Update to OCR-D 1.0.0 | 2019-10-21 17:04:49 +02:00 |
| 7025d960b4 | ✨ Use ocrd_olena for binarization | 2019-10-21 17:04:06 +02:00 |
| 3687d6d7b4 | 🧹 Do not remove line confidences anymore | 2019-10-11 19:17:30 +02:00 |
| 6454d20998 | ✨ Use sbb_textline_detector to segment lines | 2019-10-11 19:16:43 +02:00 |
| bdab016e2c | ✨ Use GT4HistOCR_2000000 model from qurator-data for Tesseract | 2019-10-02 16:48:28 +02:00 |
| 47dd5d3b62 | 🎨 Move XML schemata to a better path | 2019-09-30 18:25:54 +02:00 |
| af2034400a | 🎨 Add extra newlines to separate steps | 2019-09-30 12:26:14 +02:00 |
| 1863439d92 | 💩 Remove extra Pillow dependency workarounds | 2019-09-30 12:25:31 +02:00 |
| e5cd5b937e | ✨ Run pip3 list for easier checking | 2019-09-27 13:16:14 +02:00 |
| bd24624bd7 | ⬆ Do not downgrade to PAGE 2018 anymore | 2019-09-27 13:02:46 +02:00 |
| 0b2b66a0b4 | 🔧 Allow setting LOG_LEVEL | 2019-09-27 12:09:37 +02:00 |
| f19bba45b8 | 💩 Remove mysterious TEMP directory for now | 2019-09-26 16:55:54 +02:00 |
| 68902f923d | 📜 Downgrading to PAGE 2018 is not the last step anymore | 2019-09-26 16:55:02 +02:00 |
| 6c0d7e0aee | 💩 Do not fix PAGE image references for now | 2019-09-26 16:46:12 +02:00 |
| 343a3fbf82 | 🔧 Evaluate both Tesseract and Calamari results | 2019-08-21 13:07:27 +02:00 |
| 0bc06c2fad | ✨ Run Calamari OCR | 2019-08-21 11:54:01 +02:00 |
| daed87566e | 🚑 Don't install typegroups classifier for now | 2019-08-16 18:23:15 +02:00 |
| d8f3438ac5 | 🚑 Don't check pixel density | 2019-08-16 18:21:59 +02:00 |
| 85ff80d548 | ✨ Use dinglehopper's new OCR-D interface | 2019-08-16 14:04:41 +02:00 |
| d5aa273b44 | 🚧 Use ocr-eval aka dinglehopper | 2019-08-13 18:13:49 +02:00 |
| be5750f4e1 | ✨ As a last step, downgrade to PAGE 2018 to support PAGE Viewer | 2019-08-05 18:46:36 +02:00 |
| cf2b4de2a0 | 🧹 Validate again after fixing image references | 2019-08-05 17:46:20 +02:00 |
| 21e00932be | 🐛 Use a valid filegrp USE for fontident | 2019-08-05 17:38:24 +02:00 |
| ade39a278c | 🎨 Align file groups | 2019-08-05 17:08:58 +02:00 |
| 3fee2d4fe6 | 📌 Use my ocrd_typegroups_classifier fix for passing down the page id | 2019-08-05 17:00:54 +02:00 |
| 44772f1923 | 🚧 Work around problems with ocrd-tesserocr producing TextEquiv/@conf | 2019-08-05 15:40:39 +02:00 |
| 8b67866aac | ✨ Validate PAGE XML after OCR | 2019-08-05 15:31:24 +02:00 |
| 0d7fd21446 | ✨ Validate workspace after each step | 2019-08-05 15:27:38 +02:00 |
| de841746e3 | Use PAGE 2019 | 2019-08-02 11:58:56 +02:00 |
| ff0570e151 | Use frk for now | 2019-08-02 11:58:46 +02:00 |
| cc81afa1a5 | 🧹 No need to clean up after tesserocr | 2019-07-03 13:46:49 +02:00 |
| 89a2893e4e | ❌ I do not care for the multiple mets:agents elements | 2019-07-03 12:35:15 +02:00 |
| 0e63fa1756 | ⁉ PyTessApi seems to use both engine modes | 2019-07-03 12:30:52 +02:00 |
| e3a1afbc93 | 📝 Document the functions | 2019-07-03 12:22:55 +02:00 |
| f3e37dd16c | Do not hardcode path to typegroups model binary | 2019-06-24 17:31:25 +02:00 |
| 8d66469621 | Binarize images before segmenting | 2019-06-24 12:34:08 +02:00 |
| 5e1ece4877 | Use ocrd-tesserocr-segment-* | 2019-06-24 12:13:49 +02:00 |