mirror of https://github.com/qurator-spk/dinglehopper.git synced 2025-06-10 20:29:57 +02:00
Commit graph

534 commits

Author SHA1 Message Date
9fc8937324 ✒ README: Mention dinglehopper-line-dirs --help 2025-04-24 15:13:19 +02:00
14a4bc56d8 🐛 Add --plain-encoding option to dinglehopper-extract 2025-04-22 18:24:35 +02:00
a70260c10e 🐛 Use warning() to fix DeprecationWarning 2025-04-22 13:57:19 +02:00
224aa02163 🚧 Fix help text 2025-04-22 13:57:19 +02:00
9db5b4caf5 🚧 Add OCR-D parameter for plain text encoding 2025-04-22 13:57:19 +02:00
5578ce83a3 🚧 Add option for text encoding to line dir cli 2025-04-22 13:57:19 +02:00
cf59b951a3 🚧 Add option for text encoding to line dir cli 2025-04-22 13:57:19 +02:00
480b3cf864 ✔ Test that CLI produces a complete HTML report 2025-04-22 13:57:19 +02:00
f1a586cff1 ✔ Test line dirs CLI 2025-04-22 13:57:18 +02:00
3b16c14c16 ✔ Properly test line dir finding 2025-04-22 13:57:18 +02:00
322faeb26c 🎨 Sort imports 2025-04-22 13:57:18 +02:00
c37316da09 🐛 cli_line_dirs: Fix word differences section
At the time of generation of the section, the {gt,ocr}_words generators
were drained. Fix by using a list.

Fixes gh-124.
2025-04-22 13:57:18 +02:00
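The generator problem described in commit c37316da09 can be illustrated with a minimal sketch. It is not the project's code: the `words()` helper and the sample text are hypothetical stand-ins, and only the general Python behavior (a generator yields its items once) is taken as given.

```python
from typing import Iterator, List


def words(text: str) -> Iterator[str]:
    """Hypothetical stand-in for a word generator: yields whitespace-separated tokens."""
    yield from text.split()


# A generator can be consumed only once.
gt_words = words("the quick brown fox")
first_pass = list(gt_words)   # consumes every item
second_pass = list(gt_words)  # generator is drained, nothing is left

# Materializing into a list up front lets several report sections reuse the words.
gt_words_list: List[str] = list(words("the quick brown fox"))
first_use = len(gt_words_list)
second_use = [w.upper() for w in gt_words_list]
```

Here `second_pass` ends up empty, which is the kind of silent failure a later report section would hit, while the list version can be read any number of times.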
9414a92f9f 🐛 cli_line_dirs: Type-annotate functions 2025-04-22 13:57:18 +02:00
68344e48f8 🎨 Reformat cli_line_dirs 2025-04-22 13:57:18 +02:00
73ee16fe51 🚧 Support 'merged' GT+OCR line directories 2025-04-22 13:57:18 +02:00
6980d7a252 🚧 Use our own removesuffix() as we still support Python 3.8 2025-04-22 13:57:18 +02:00
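For context on commit 6980d7a252: `str.removesuffix()` only exists from Python 3.9 onward, so code that still supports 3.8 needs its own helper. The following is a minimal sketch of such a helper, not the project's actual implementation; the function body and the example file name are assumptions.

```python
def removesuffix(text: str, suffix: str) -> str:
    """Return text without the given suffix, mirroring str.removesuffix() from Python 3.9+."""
    # The empty-suffix guard matters: text[:-0] would return an empty string.
    if suffix and text.endswith(suffix):
        return text[: -len(suffix)]
    return text


# Hypothetical usage: strip a known extension from a line file name.
stem = removesuffix("line0001.gt.txt", ".gt.txt")
```

Strings that do not end with the suffix are returned unchanged, matching the behavior of the built-in method.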
2bf2529c38 🚧 Port new line dir functions 2025-04-22 13:57:17 +02:00
ad8e6de36b 🐛 cli_line_dirs: Fix character diff reports 2025-04-22 13:57:17 +02:00
4024e350f7 🚧 Test new flexible line dirs functions 2025-04-22 13:57:17 +02:00
3c317cbeaf
Merge pull request #141 from qurator-spk/chore/update-pre-commit
⚙  pre-commit: update
2025-04-22 12:35:14 +02:00
d8403421fc ⚙ pre-commit: update 2025-04-22 12:30:47 +02:00
3305043234
Merge pull request #140 from qurator-spk/fix/vendor-strings
🐛 Fix vendor strings
2025-04-22 11:50:29 +02:00
6bf5bd7178 🐛 Fix vendor strings 2025-04-22 11:48:44 +02:00
817e0c95f7 📦 v0.10.1 2025-04-22 10:32:29 +02:00
3d7c7ee1e3
Merge pull request #139 from bertsky/allow-uniseg-py38
re-allow uniseg 0.8 and py38
2025-04-22 10:09:51 +02:00
Robert Sachunsky
a24623b966 re-allow py38 2025-04-17 16:47:13 +02:00
Robert Sachunsky
ea33602336 CI: reactivate py38 2025-04-17 16:12:42 +02:00
Robert Sachunsky
64444dd419 opt out of 7f8a8dd5 (uniseg update that requires py39) 2025-04-17 16:12:37 +02:00
f6dfb77f94 🐛 pyproject.toml: Fix description 2025-04-17 08:51:32 +02:00
ef817cb343 📦 v0.10.0 2025-04-17 08:37:37 +02:00
b1c109baae
Merge pull request #128 from kba/v3-api
V3 api
2025-04-17 08:34:51 +02:00
13ab1ae150 🐛 Docker: Use same vendor as license for now 2025-04-17 08:26:36 +02:00
d974369e13 🐛 Docker: Fix description 2025-04-17 08:10:56 +02:00
b7bdca4ac8 🐛 Makefile: Make phony targets .PHONY 2025-04-17 08:09:06 +02:00
kba
831a24fc4c typo: report_prefix -> file_id 2025-04-17 08:04:52 +02:00
Konstantin Baierer
f6a2c94520 ocrd_cli: but do check for existing output files
Co-authored-by: Robert Sachunsky <38561704+bertsky@users.noreply.github.com>
2025-04-17 08:04:52 +02:00
Konstantin Baierer
4162836612 ocrd_cli: no need to check fileGrp dir exists
Co-authored-by: Robert Sachunsky <38561704+bertsky@users.noreply.github.com>
2025-04-17 08:04:52 +02:00
Konstantin Baierer
c0aa82d188 OCR-D processor: properly handle missing or non-downloaded GT/OCR file
Co-authored-by: Robert Sachunsky <38561704+bertsky@users.noreply.github.com>
2025-04-17 08:04:51 +02:00
kba
8c1b6d65f5 Dockerfile: build ocrd-all-tool.json 2025-04-17 08:04:51 +02:00
f287386c0e 🧹Don't pin uniseg and rapidfuzz
Breakage with the newest uniseg API was fixed in master.

Can't see any issue with rapidfuzz, so removing that pin, too.
2025-04-16 14:49:23 +02:00
kba
63031b30bf Port to OCR-D/core API v3 2025-04-16 14:45:16 +02:00
bf6633be02
Merge pull request #136 from qurator-spk/chore/update-liccheck
⚙  liccheck: update permissable licenses (mit-cmu, psf 2.0, iscl)
2025-04-16 11:13:02 +02:00
d3aa9eb520 ⚙ liccheck: update permissable licenses (mit-cmu, psf 2.0, iscl) 2025-04-16 11:09:33 +02:00
625686f204
Merge pull request #135 from qurator-spk/chore/update-python-version
⚙  pyproject.toml: Update supported Python version
2025-04-16 11:01:09 +02:00
ce7886af23 ⚙ pyproject.toml: Update supported Python version 2025-04-16 10:57:10 +02:00
a09a624bde
Merge pull request #132 from qurator-spk/fix/uniseg-removed-index-parameter
🐛 Fix for changed API of uniseg's word_break
2025-04-16 09:28:31 +02:00
badfa9c99e ⚙ GitHub Actions: Don't test on Python 3.8 anymore 2025-04-16 09:25:44 +02:00
7f8a8dd564 🐛 Fix for changed API of uniseg's word_break 2025-04-16 09:10:43 +02:00
b72d4f5af9
Merge pull request #131 from qurator-spk/chore/update-pre-commit
⚙  pre-commit: update
2025-04-16 09:06:05 +02:00
058042accb ⚙ pre-commit: update 2025-04-16 08:59:58 +02:00