195354c6d4
Merge branch 'feat/compare-line-texts'
continuous-integration/drone/push Build encountered an error
2022-01-24 18:46:33 +01:00
8a3f5e48c2
🐛 dinglehopper: Patch word_break only once
continuous-integration/drone/push Build encountered an error
Previously, we (accidentally) patched uniseg's word_break on every call
to words(). Do it only once.
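A minimal sketch of the patch-once idea, not dinglehopper's actual code. It uses a hypothetical stand-in object `lib` instead of uniseg, and the Private Use Area rule is only an illustrative assumption. Without the guard, every call to `words()` would wrap the previous wrapper again, so the call chain would grow with each invocation.

```python
import types

# Hypothetical stand-in for a third-party module (e.g. uniseg.wordbreak).
lib = types.SimpleNamespace(word_break=lambda ch: "Other")

_patched = False  # module-level flag: has the patch been applied yet?


def patch_word_break():
    """Wrap lib.word_break exactly once, however often this is called."""
    global _patched
    if _patched:  # guard: never stack a second wrapper on top
        return
    original = lib.word_break

    def wrapper(ch):
        # Illustrative rule (assumption): treat Private Use Area as letters.
        if 0xE000 <= ord(ch) <= 0xF8FF:
            return "ALetter"
        return original(ch)

    # Track nesting depth so repeated patching would be observable.
    wrapper.depth = getattr(original, "depth", 0) + 1
    lib.word_break = wrapper
    _patched = True


def words(text):
    patch_word_break()  # cheap no-op after the first call
    return [lib.word_break(ch) for ch in text]
```

Calling `words()` repeatedly leaves `lib.word_break.depth` at 1; removing the guard would increase it on every call.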
2022-01-24 18:44:30 +01:00
b6bde2b7ec
📝 dinglehopper: Document dinglehopper-line-dirs in the README
continuous-integration/drone/push Build encountered an error
2021-12-15 11:16:40 +01:00
f77ce857b2
🚧 dinglehopper: Share json_float code
continuous-integration/drone/push Build encountered an error
2021-12-14 18:37:07 +01:00
5b394649a7
🚧 dinglehopper: Compute WER in line-dirs CLI
2021-12-14 18:33:20 +01:00
cb2be96179
🚧 dinglehopper: Add word differences in line-dirs report
2021-12-14 18:20:04 +01:00
dbb660615a
🚧 dinglehopper: Compare line text directories (WIP)
continuous-integration/drone/push Build encountered an error
2021-12-14 11:37:07 +01:00
a018006f98
🚧 dinglehopper: Compare line text directories (WIP)
2021-12-14 11:37:07 +01:00
36b36f6986
🚧 dinglehopper: Compare line text directories (WIP)
2021-12-14 11:37:07 +01:00
f0f3cd2d96
⬆️ dinglehopper: Require rapidfuzz >= 1.9.1
continuous-integration/drone/push Build encountered an error
See https://github.com/qurator-spk/dinglehopper/issues/64 .
2021-12-14 11:36:00 +01:00
a5c9c7438f
💩 ocrd-galley: Work around OCR-D/core#730
continuous-integration/drone/push Build is passing
OCR-D/core currently needs six until the next release. Fix the build by
requiring it here.
2021-11-05 17:05:54 +01:00
7d26b049d1
Merge branch 'fix/ci-py310'
2021-10-26 13:28:57 +02:00
51a44895dc
⬆️ CircleCI: Add Python 3.10
2021-10-26 13:24:50 +02:00
1f8fa5176f
Revert "⬆️ CircleCI: Add Python 3.10"
This reverts commit b2b21839c2.
2021-10-23 15:22:57 +02:00
b2b21839c2
⬆️ CircleCI: Add Python 3.10
2021-10-22 18:41:47 +02:00
7d85e21cbc
⬆️ CircleCI: Switch to the new cimg/python image
2021-10-22 18:39:54 +02:00
dea0c53f88
Merge branch 'rapidfuzz'
2021-10-22 18:19:58 +02:00
06ea38449c
📝 dinglehopper: Update Levenshtein notebook
2021-10-22 16:58:40 +02:00
3ee688001a
🧹 dinglehopper: Directly import levenshtein() from rapidfuzz
2021-10-22 16:30:21 +02:00
5d496df267
⚡ dinglehopper: Remove tests that only test rapidfuzz's levenshtein()
2021-10-22 16:26:55 +02:00
091f069b3c
⚡ dinglehopper: Remove tests that only test rapidfuzz's levenshtein_ops()
2021-10-22 16:21:16 +02:00
af8da1d716
⚡ dinglehopper: Use rapidfuzz for editops
2021-10-22 15:38:59 +02:00
249787686f
Merge branch 'master' of github.com:qurator-spk/dinglehopper
continuous-integration/drone/push Build is failing
2021-05-20 09:42:15 +02:00
2a6cc5823e
🐛 dinglehopper: Call initLogging before logging
When using ocrd_utils' getLogger(), we need to call initLogging() before doing any
logging.
Fixes #55.
2021-05-20 09:39:09 +02:00
0b9af3a21e
Merge pull request #58 from kba/unorderedgroupindexed
continuous-integration/drone/push Build is passing
ReadingOrder may also contain UnorderedGroupIndexed
2021-05-18 18:32:32 +02:00
Konstantin Baierer
7fde00d911
ReadingOrder may also contain UnorderedGroupIndexed
2021-05-18 17:34:08 +02:00
1778b36a9a
🚧 dinglehopper: Read PAGE UnorderedGroup in XML order
2021-04-15 21:09:45 +02:00
bd324331e6
🚧 dinglehopper: Try out Drone CI
continuous-integration/drone/push Build is passing
2021-02-11 14:26:29 +01:00
a59ecb795c
🚧 dinglehopper: Try out Drone CI
continuous-integration/drone/push Build is failing
2021-02-11 14:15:08 +01:00
14230e073a
🚧 dinglehopper: Try out Drone CI
2021-02-11 14:08:25 +01:00
985666a71c
🚧 dinglehopper: Try out Drone CI
2021-02-10 20:35:22 +01:00
4a73053cfc
🚧 Replace Travis with CircleCI
2021-02-10 18:22:52 +01:00
e3d4493c82
🚧 Replace Travis with CircleCI
2021-02-10 17:58:58 +01:00
27f4c3bdf8
🚧 Replace Travis with CircleCI
2021-02-10 17:57:08 +01:00
8533e6d421
🚧 Replace Travis with CircleCI
2021-02-10 17:55:09 +01:00
e8da8b63f8
🚧 Replace Travis with CircleCI
2021-02-10 17:53:50 +01:00
3b7a1a5631
🚧 Replace Travis with CircleCI
2021-02-10 17:50:34 +01:00
691ce371ca
Merge pull request #50 from b2m/fix-table-extraction
Fix the extraction of text from Page with TableRegion
2021-02-01 17:51:33 +01:00
Benjamin Rosemann
a68fc269d9
Fix the extraction of text from Page with TableRegion
Dinglehopper did not consider `OrderedGroupIndexed` in the `ReadingOrder`
element when extracting text regions. As a consequence, a `TableRegion`
was not considered for text extraction.
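A rough sketch of walking a PAGE `ReadingOrder` that also descends into indexed groups, so regions nested under a table are reached. This is not the code from this commit: the sample XML, the region ids, and the helper names `local`, `region_ids`, and `reading_order` are assumptions made for illustration. Element names follow the PAGE schema (`OrderedGroupIndexed`, `UnorderedGroupIndexed`, `RegionRefIndexed`).

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; ids are made up for illustration.
PAGE_XML = """<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15">
  <Page>
    <ReadingOrder>
      <OrderedGroup id="ro">
        <RegionRefIndexed index="1" regionRef="r2"/>
        <OrderedGroupIndexed index="0" id="g1" regionRef="table1">
          <RegionRefIndexed index="1" regionRef="cell2"/>
          <RegionRefIndexed index="0" regionRef="cell1"/>
        </OrderedGroupIndexed>
      </OrderedGroup>
    </ReadingOrder>
  </Page>
</PcGts>"""

ORDERED = {"OrderedGroup", "OrderedGroupIndexed"}
GROUPS = ORDERED | {"UnorderedGroup", "UnorderedGroupIndexed"}
REFS = {"RegionRef", "RegionRefIndexed"}


def local(tag):
    """Strip the XML namespace from a tag name."""
    return tag.rsplit("}", 1)[-1]


def region_ids(group):
    """Yield regionRef ids below a group, recursing into nested groups."""
    children = [c for c in group if local(c.tag) in GROUPS | REFS]
    if local(group.tag) in ORDERED:
        # Ordered groups: follow the index attribute, not the XML order.
        children.sort(key=lambda c: int(c.get("index")))
    for child in children:
        if local(child.tag) in REFS:
            yield child.get("regionRef")
        else:
            yield from region_ids(child)


def reading_order(xml_text):
    root = ET.fromstring(xml_text)
    ro = next(e for e in root.iter() if local(e.tag) == "ReadingOrder")
    return [rid for g in ro if local(g.tag) in GROUPS for rid in region_ids(g)]
```

For the sample above, `reading_order(PAGE_XML)` visits the table's cells before `r2`, because the nested group carries index 0.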
2020-11-27 11:18:11 +01:00
8cd8314c8a
🐛 dinglehopper: Bump up ocrd req for zip_input_files
See also GH-49.
2020-11-19 18:59:47 +01:00
62670dd0c7
Merge pull request #49 from kba/zip_input_files
ocrd cli: use core-provided zip_input_files method
2020-11-19 18:54:21 +01:00
Konstantin Baierer
74e0ac18ed
ocrd cli: use core-provided zip_input_files method
2020-11-19 16:00:28 +01:00
389e253c11
🐛 dinglehopper: Fix alto_extract_lines()'s type annotation
2020-11-12 19:32:38 +01:00
fe3923a8af
🐛 dinglehopper: Fix alto_extract()'s type annotation
2020-11-12 19:19:05 +01:00
132f91d500
✔️ dinglehopper: Add missing integration test markers
2020-11-12 19:10:23 +01:00
c48d7646df
📝 dinglehopper: README-DEV: Massage markdown a bit
2020-11-12 19:05:14 +01:00
fed021090d
Merge pull request #46 from b2m/tool-changes
Tool changes
2020-11-12 18:59:25 +01:00
Benjamin Rosemann
cb1ac9d260
Add black to developer requirements.
2020-11-11 11:36:17 +01:00
Benjamin Rosemann
03ad413f4a
Added some helpful tools and configurations
2020-11-11 11:36:17 +01:00
Benjamin Rosemann
5cbd4f3d95
Preparation for black code formatter
2020-11-11 11:36:17 +01:00