Max Bachmann
01571f23b7
move grapheme clusters to ExtractedText
2022-08-29 01:49:04 +02:00
Max Bachmann
f211d09f56
remove python2.7 futures
2022-08-29 00:50:33 +02:00
Max Bachmann
205a969c0e
remove unused includes
2022-08-29 00:48:40 +02:00
Max Bachmann
f3825cdeb6
only call words_normalized once
2022-08-29 00:22:23 +02:00
dcc10c5389
✔️ Skip test_lines_similar() for now
...
test_lines_similar() fails with rapidfuzz 2.5 and is flawed anyway:
The test was based on our own implementation that used __eq__ and not __hash__ as
rapidfuzz does. Need to review this in the future.
2022-08-18 15:51:16 +02:00
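The note above contrasts comparison via `__eq__` with comparison via `__hash__`. A minimal sketch of why the two can disagree, using a hypothetical `LooseChar` class and Python's own hash-based `set` as a stand-in (this is not rapidfuzz's code, and the class is an assumption made for illustration only):

```python
class LooseChar:
    """Hypothetical single-character token that compares case-insensitively."""

    def __init__(self, text):
        self.text = text

    def __eq__(self, other):
        # Loose equality: "A" and "a" count as the same token.
        return isinstance(other, LooseChar) and self.text.lower() == other.text.lower()

    def __hash__(self):
        # Deliberately inconsistent with __eq__: hashes the exact character.
        return ord(self.text)


a, b = LooseChar("A"), LooseChar("a")

equal_by_eq = a == b                 # __eq__ treats them as the same token
equal_by_hash = hash(a) == hash(b)   # __hash__ tells them apart
distinct_in_set = len({a, b})        # hash-based containers follow __hash__
```

Any code path that looks tokens up by hash sees two different tokens here, while an `__eq__`-based comparison sees one, so a test written against the latter can fail against the former.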
555f586775
📝 Note that old terminals might not render the Unicode characters correctly
2022-08-17 17:59:15 +02:00
c4e85da5ab
🐛 Update editops() and seq_align() due to RapidFuzz API changes
2022-08-17 17:55:44 +02:00
15dfbac3a7
Revert "Revert "Merge pull request #67 from maxbachmann/rapidfuzz""
...
This reverts commit 76bd50f1db.
2022-08-17 11:42:19 +02:00
ede9402a6c
Revert "💩 Stick with rapidfuzz < 2.1.0 for now"
...
This reverts commit 0e153db9ca.
2022-08-17 11:42:07 +02:00
0e153db9ca
💩 Stick with rapidfuzz < 2.1.0 for now
2022-08-16 19:34:48 +02:00
76bd50f1db
Revert "Merge pull request #67 from maxbachmann/rapidfuzz"
...
This reverts commit 85f751aacc, reversing changes made to 1febea8c92.
2022-08-16 19:31:28 +02:00
85f751aacc
Merge pull request #67 from maxbachmann/rapidfuzz
...
replace usage of deprecated rapidfuzz APIs
2022-08-16 16:35:54 +02:00
Max Bachmann
e543438496
replace usage of deprecated rapidfuzz APIs
2022-08-07 10:40:31 +02:00
1febea8c92
Merge pull request #66 from stweil/master
...
continuous-integration/drone/push Build is passing
Ignore Python build artifacts
2022-03-30 13:40:36 +02:00
Stefan Weil
101f50ec88
Ignore Python build artifacts
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2022-03-24 16:51:37 +01:00
edc24cd4db
✔️ DroneCI: Build on Python 3.6 → 3.10
continuous-integration/drone/push Build is passing
2022-03-03 16:35:26 +01:00
d726396002
👷🏾‍♂️ Remove str() on Path objects
...
As of Python 3.6 we don't need to call str() on Path objects anymore.
See also gh-20.
2022-03-02 11:19:40 +01:00
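As a small illustration of the point about `str()` and `Path` objects: since Python 3.6, built-ins such as `open()` accept path-like objects directly. The file name `gt.txt` below is a made-up example, not a file from the repository:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "gt.txt"  # hypothetical file name
    path.write_text("Hello", encoding="utf-8")

    # Python >= 3.6: pass the Path object directly.
    with open(path, encoding="utf-8") as f:
        via_path = f.read()

    # Older style: convert explicitly. Still works, but is no longer needed.
    with open(str(path), encoding="utf-8") as f:
        via_str = f.read()
```

Both reads return the same content, so the explicit `str()` call can be dropped once Python 3.5 is out of the support matrix.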
a19224dc46
✔️ CircleCI: Stop testing using Python 3.5
...
The latest rapidfuzz updates broke Python 3.5 support. As it has been EOL for some time now,
we are stopping testing with it.
See also gh-65 and gh-20.
2022-02-28 14:46:34 +01:00
76bacc0f15
🐛 Bump rapidfuzz dep to >= 2.0.5 (Fixes gh-65)
2022-02-28 14:35:54 +01:00
195354c6d4
Merge branch 'feat/compare-line-texts'
continuous-integration/drone/push Build encountered an error
2022-01-24 18:46:33 +01:00
8a3f5e48c2
🐛 dinglehopper: Patch word_break only once
...
continuous-integration/drone/push Build encountered an error
Previously, we (accidentally) patched uniseg's word_break on every call
to words(). Do it only once.
2022-01-24 18:44:30 +01:00
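One common way to apply a monkey patch only once is a module-level guard. The sketch below is an assumption about the general pattern, not the actual dinglehopper change: `lib` is a stand-in namespace rather than the real uniseg module, and the replacement rule is purely illustrative.

```python
import types

# Hypothetical stand-in for a third-party module whose function gets patched.
lib = types.SimpleNamespace(word_break=lambda ch: "Other")

patch_count = 0   # only here to show how often the patch is applied
_patched = False  # module-level guard


def patch_word_break_once():
    """Replace lib.word_break the first time this is called, then do nothing."""
    global patch_count, _patched
    if _patched:
        return
    original = lib.word_break

    def patched(ch):
        # Illustrative rule only: treat the apostrophe as a letter.
        return "ALetter" if ch == "'" else original(ch)

    lib.word_break = patched
    _patched = True
    patch_count += 1


def words(text):
    # Safe to call on every invocation: the guard makes it a no-op after the first.
    patch_word_break_once()
    return text.split()


for _ in range(3):
    words("a b c")
```

Without the guard, each call to `words()` would wrap the already patched function again, adding one more layer of indirection per call. Patching once at import time would be an alternative with the same effect.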
b6bde2b7ec
📝 dinglehopper: Document dinglehopper-line-dirs in the README
continuous-integration/drone/push Build encountered an error
2021-12-15 11:16:40 +01:00
f77ce857b2
🚧 dinglehopper: Share json_float code
continuous-integration/drone/push Build encountered an error
2021-12-14 18:37:07 +01:00
5b394649a7
🚧 dinglehopper: Compute WER in line-dirs CLI
2021-12-14 18:33:20 +01:00
cb2be96179
🚧 dinglehopper: Add word differences in line-dirs report
2021-12-14 18:20:04 +01:00
dbb660615a
🚧 dinglehopper: Compare line text directories (WIP)
continuous-integration/drone/push Build encountered an error
2021-12-14 11:37:07 +01:00
a018006f98
🚧 dinglehopper: Compare line text directories (WIP)
2021-12-14 11:37:07 +01:00
36b36f6986
🚧 dinglehopper: Compare line text directories (WIP)
2021-12-14 11:37:07 +01:00
f0f3cd2d96
⬆️ dinglehopper: Require rapidfuzz >= 1.9.1
...
continuous-integration/drone/push Build encountered an error
See https://github.com/qurator-spk/dinglehopper/issues/64 .
2021-12-14 11:36:00 +01:00
a5c9c7438f
💩 ocrd-galley: Work around OCR-D/core#730
...
continuous-integration/drone/push Build is passing
OCR-D/core currently needs six until the next release. Fix the build by
requiring it here.
2021-11-05 17:05:54 +01:00
7d26b049d1
Merge branch 'fix/ci-py310'
2021-10-26 13:28:57 +02:00
51a44895dc
⬆️ CircleCI: Add Python 3.10
2021-10-26 13:24:50 +02:00
1f8fa5176f
Revert "⬆️ CircleCI: Add Python 3.10"
...
This reverts commit b2b21839c2.
2021-10-23 15:22:57 +02:00
b2b21839c2
⬆️ CircleCI: Add Python 3.10
2021-10-22 18:41:47 +02:00
7d85e21cbc
⬆️ CircleCI: Switch to the new cimg/python image
2021-10-22 18:39:54 +02:00
dea0c53f88
Merge branch 'rapidfuzz'
2021-10-22 18:19:58 +02:00
06ea38449c
📝 dinglehopper: Update Levenshtein notebook
2021-10-22 16:58:40 +02:00
3ee688001a
🧹 dinglehopper: Directly import levenshtein() from rapidfuzz
2021-10-22 16:30:21 +02:00
5d496df267
⚡ dinglehopper: Remove tests that only test rapidfuzz's levenshtein()
2021-10-22 16:26:55 +02:00
091f069b3c
⚡ dinglehopper: Remove tests that only test rapidfuzz's levenshtein_ops()
2021-10-22 16:21:16 +02:00
af8da1d716
⚡ dinglehopper: Use rapidfuzz for editops
2021-10-22 15:38:59 +02:00
249787686f
Merge branch 'master' of github.com:qurator-spk/dinglehopper
continuous-integration/drone/push Build is failing
2021-05-20 09:42:15 +02:00
2a6cc5823e
🐛 dinglehopper: Call initLogging before logging
...
When using ocrd_utils' getLogger(), we need to call initLogging() before doing any
logging.
Fixes #55.
2021-05-20 09:39:09 +02:00
0b9af3a21e
Merge pull request #58 from kba/unorderedgroupindexed
...
continuous-integration/drone/push Build is passing
ReadingOrder may also contain UnorderedGroupIndexed
2021-05-18 18:32:32 +02:00
Konstantin Baierer
7fde00d911
ReadingOrder may also contain UnorderedGroupIndexed
2021-05-18 17:34:08 +02:00
1778b36a9a
🚧 dinglehopper: Read PAGE UnorderedGroup in XML order
2021-04-15 21:09:45 +02:00
bd324331e6
🚧 dinglehopper: Try out Drone CI
continuous-integration/drone/push Build is passing
2021-02-11 14:26:29 +01:00
a59ecb795c
🚧 dinglehopper: Try out Drone CI
continuous-integration/drone/push Build is failing
2021-02-11 14:15:08 +01:00
14230e073a
🚧 dinglehopper: Try out Drone CI
2021-02-11 14:08:25 +01:00
985666a71c
🚧 dinglehopper: Try out Drone CI
2021-02-10 20:35:22 +01:00