mirror of https://github.com/qurator-spk/dinglehopper.git synced 2025-06-09 11:50:00 +02:00
Commit graph

262 commits

Author SHA1 Message Date
8a3f5e48c2 🐛 dinglehopper: Patch word_break only once
Some checks reported errors
continuous-integration/drone/push Build encountered an error
Previously, we (accidentally) patched uniseg's word_break on every call
to words(). Do it only once.
2022-01-24 18:44:30 +01:00
b6bde2b7ec 📝 dinglehopper: Document dinglehopper-line-dirs in the README
Some checks reported errors
continuous-integration/drone/push Build encountered an error
2021-12-15 11:16:40 +01:00
f77ce857b2 🚧 dinglehopper: Share json_float code
Some checks reported errors
continuous-integration/drone/push Build encountered an error
2021-12-14 18:37:07 +01:00
5b394649a7 🚧 dinglehopper: Compute WER in line-dirs CLI 2021-12-14 18:33:20 +01:00
cb2be96179 🚧 dinglehopper: Add word differences in line-dirs report 2021-12-14 18:20:04 +01:00
dbb660615a 🚧 dinglehopper: Compare line text directories (WIP)
Some checks reported errors
continuous-integration/drone/push Build encountered an error
2021-12-14 11:37:07 +01:00
a018006f98 🚧 dinglehopper: Compare line text directories (WIP) 2021-12-14 11:37:07 +01:00
36b36f6986 🚧 dinglehopper: Compare line text directories (WIP) 2021-12-14 11:37:07 +01:00
f0f3cd2d96 ⬆️ dinglehopper: Require rapidfuzz >= 1.9.1
Some checks reported errors
continuous-integration/drone/push Build encountered an error
See https://github.com/qurator-spk/dinglehopper/issues/64.
2021-12-14 11:36:00 +01:00
a5c9c7438f 💩 ocrd-galley: Work around OCR-D/core#730
All checks were successful
continuous-integration/drone/push Build is passing
OCR-D/core currently needs six until the next release. Fix the build by
requiring it here.
2021-11-05 17:05:54 +01:00
7d26b049d1 Merge branch 'fix/ci-py310' 2021-10-26 13:28:57 +02:00
51a44895dc ⬆️ CircleCI: Add Python 3.10 2021-10-26 13:24:50 +02:00
1f8fa5176f Revert "⬆️ CircleCI: Add Python 3.10"
This reverts commit b2b21839c2.
2021-10-23 15:22:57 +02:00
b2b21839c2 ⬆️ CircleCI: Add Python 3.10 2021-10-22 18:41:47 +02:00
7d85e21cbc ⬆️ CircleCI: Switch to the new cimg/python image 2021-10-22 18:39:54 +02:00
dea0c53f88 Merge branch 'rapidfuzz' 2021-10-22 18:19:58 +02:00
06ea38449c 📝 dinglehopper: Update Levenshtein notebook 2021-10-22 16:58:40 +02:00
3ee688001a 🧹 dinglehopper: Directly import levenshtein() from rapidfuzz 2021-10-22 16:30:21 +02:00
5d496df267 dinglehopper: Remove tests that only test rapidfuzz's levenshtein() 2021-10-22 16:26:55 +02:00
091f069b3c dinglehopper: Remove tests that only test rapidfuzz's levenshtein_ops() 2021-10-22 16:21:16 +02:00
af8da1d716 dinglehopper: Use rapidfuzz for editops 2021-10-22 15:38:59 +02:00
249787686f Merge branch 'master' of github.com:qurator-spk/dinglehopper
Some checks failed
continuous-integration/drone/push Build is failing
2021-05-20 09:42:15 +02:00
2a6cc5823e 🐛 dinglehopper: Call initLogging before logging
When using ocrd_utils' getLogger(), we need to call initLogging() before doing any
logging.

Fixes #55.
2021-05-20 09:39:09 +02:00
0b9af3a21e Merge pull request #58 from kba/unorderedgroupindexed
All checks were successful
continuous-integration/drone/push Build is passing
ReadingOrder may also contain UnorderedGroupIndexed
2021-05-18 18:32:32 +02:00
Konstantin Baierer
7fde00d911 ReadingOrder may also contain UnorderedGroupIndexed 2021-05-18 17:34:08 +02:00
1778b36a9a 🚧 dinglehopper: Read PAGE UnorderedGroup in XML order 2021-04-15 21:09:45 +02:00
bd324331e6 🚧 dinglehopper: Try out Drone CI
All checks were successful
continuous-integration/drone/push Build is passing
2021-02-11 14:26:29 +01:00
a59ecb795c 🚧 dinglehopper: Try out Drone CI
Some checks failed
continuous-integration/drone/push Build is failing
2021-02-11 14:15:08 +01:00
14230e073a 🚧 dinglehopper: Try out Drone CI 2021-02-11 14:08:25 +01:00
985666a71c 🚧 dinglehopper: Try out Drone CI 2021-02-10 20:35:22 +01:00
4a73053cfc 🚧 Replace Travis with CircleCI 2021-02-10 18:22:52 +01:00
e3d4493c82 🚧 Replace Travis with CircleCI 2021-02-10 17:58:58 +01:00
27f4c3bdf8 🚧 Replace Travis with CircleCI 2021-02-10 17:57:08 +01:00
8533e6d421 🚧 Replace Travis with CircleCI 2021-02-10 17:55:09 +01:00
e8da8b63f8 🚧 Replace Travis with CircleCI 2021-02-10 17:53:50 +01:00
3b7a1a5631 🚧 Replace Travis with CircleCI 2021-02-10 17:50:34 +01:00
691ce371ca Merge pull request #50 from b2m/fix-table-extraction
Fix the extraction of text from Page with TableRegion
2021-02-01 17:51:33 +01:00
Benjamin Rosemann
a68fc269d9 Fix the extraction of text from Page with TableRegion
Dinglehopper did not consider `OrderedGroupIndexed` in the `ReadingOrder`
element when extracting text regions. As a consequence a `TableRegion`
was not considered for text extraction.
2020-11-27 11:18:11 +01:00
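To make the fix above more concrete, here is a rough sketch of walking a PAGE `ReadingOrder` recursively so that nested indexed groups (as used for tables) are not skipped. It is not dinglehopper's implementation: the sample XML, the region ids, and the helper names `local_name`, `region_refs`, and `reading_order` are made up for illustration. Ordered groups are sorted by their `index` attribute, and unordered groups are assumed to be read in XML order.

```python
import xml.etree.ElementTree as ET

# Made-up PAGE snippet: a table modelled as an OrderedGroupIndexed.
PAGE_XML = """\
<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15">
  <Page>
    <ReadingOrder>
      <OrderedGroup id="ro">
        <RegionRefIndexed index="1" regionRef="r2"/>
        <RegionRefIndexed index="0" regionRef="r1"/>
        <OrderedGroupIndexed index="2" id="table" regionRef="t1">
          <RegionRefIndexed index="0" regionRef="t1_c1"/>
          <RegionRefIndexed index="1" regionRef="t1_c2"/>
        </OrderedGroupIndexed>
        <UnorderedGroupIndexed index="3" id="misc">
          <RegionRef regionRef="r3"/>
        </UnorderedGroupIndexed>
      </OrderedGroup>
    </ReadingOrder>
  </Page>
</PcGts>
"""

ORDERED = {"OrderedGroup", "OrderedGroupIndexed"}
UNORDERED = {"UnorderedGroup", "UnorderedGroupIndexed"}
REFS = {"RegionRef", "RegionRefIndexed"}


def local_name(element):
    """Return the tag name without its XML namespace."""
    return element.tag.rsplit("}", 1)[-1]


def region_refs(group):
    """Yield regionRef ids below a group, descending into nested groups."""
    children = list(group)
    if local_name(group) in ORDERED:
        # Ordered groups: follow the index attribute, not the XML order.
        children.sort(key=lambda child: int(child.get("index", 0)))
    for child in children:
        name = local_name(child)
        if name in REFS:
            yield child.get("regionRef")
        elif name in ORDERED | UNORDERED:
            yield from region_refs(child)


def reading_order(xml_text):
    """Return all referenced region ids of a PAGE document in reading order."""
    root = ET.fromstring(xml_text)
    ro = next(el for el in root.iter() if local_name(el) == "ReadingOrder")
    return [ref for group in ro for ref in region_refs(group)]


print(reading_order(PAGE_XML))  # ['r1', 'r2', 't1_c1', 't1_c2', 'r3']
```

A walker that only looked at `RegionRefIndexed` children of the top-level group would return just `r1` and `r2` here, which mirrors the problem the commit message describes. The sketch deliberately ignores the `regionRef` attribute on the group itself (`t1`); whether the table region should also be emitted is a design choice left open here.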
8cd8314c8a 🐛 dinglehopper: Bump up ocrd req for zip_input_files
See also GH-49.
2020-11-19 18:59:47 +01:00
62670dd0c7 Merge pull request #49 from kba/zip_input_files
ocrd cli: use core-provided zip_input_files method
2020-11-19 18:54:21 +01:00
Konstantin Baierer
74e0ac18ed ocrd cli: use core-provided zip_input_files method 2020-11-19 16:00:28 +01:00
389e253c11 🐛 dinglehopper: Fix alto_extract_lines()'s type annotation 2020-11-12 19:32:38 +01:00
fe3923a8af 🐛 dinglehopper: Fix alto_extract()'s type annotation 2020-11-12 19:19:05 +01:00
132f91d500 ✔️ dinglehopper: Add missing integration test markers 2020-11-12 19:10:23 +01:00
c48d7646df 📝 dinglehopper: README-DEV: Massage markdown a bit 2020-11-12 19:05:14 +01:00
fed021090d Merge pull request #46 from b2m/tool-changes
Tool changes
2020-11-12 18:59:25 +01:00
Benjamin Rosemann
cb1ac9d260 Add black to developer requirements. 2020-11-11 11:36:17 +01:00
Benjamin Rosemann
03ad413f4a Added some helpful tools and configurations 2020-11-11 11:36:17 +01:00
Benjamin Rosemann
5cbd4f3d95 Preparation for black code formatter 2020-11-11 11:36:17 +01:00
Benjamin Rosemann
ce752e1912 Remove .idea folder and modify .gitignore
Sharing even parts of the .idea folder in a worldwide setting is bound to
generate more problems than solutions. Therefore it should be removed
and consequently ignored in .gitignore.

Also adds some Python specific stuff to the .gitignore file.
2020-11-11 11:36:17 +01:00