Mike Gerber
618ea567de
🐛 Fix docstring of distance() for grapheme clusters
1 year ago
Mike Gerber
e256526ea1
🐛 Fix calculation of score_hint for edge cases, e.g. when CER is infinite
...
If the CER is infinite, we can't calculate a score_hint as an int. Fall back to None
in this case.
1 year ago
Mike Gerber
bc95c03127
🕸 Do not use deprecated ID, pageId options
...
See gh-75.
1 year ago
Mike Gerber
7fef02bf0a
✔ Add mets:FLocat's @LOCTYPE/OTHERLOCTYPE to test data
...
Newest OCR-D wasn't happy with the test data anymore (see gh-89). I'm not sure if the
test data was invalid the way it was, but having a LOCTYPE certainly is "prettier" so
adding it. This fixes the test again.
1 year ago
Mike Gerber
7ed076d3c1
⬆ Update multimethod dependency
...
We had some issues while reviewing/rebasing #72. We don't support Python 3.5 anymore,
so lifting the hard pin on multimethod 1.3.
1 year ago
Gerber, Mike
a18b25b163
🐛 Update tests for ExtractedText
...
In PR gh-72, @maxbachmann introduced a new argument for ExtractedText(). Update the
corresponding tests.
2 years ago
Max Bachmann
f48e305347
use uniseg again
2 years ago
Max Bachmann
d2bbc8a6c7
update rapidfuzz version
2 years ago
Max Bachmann
a1f0a5e2d3
replace uniseg with uniseg2
2 years ago
Max Bachmann
22c3817f45
apply black
2 years ago
Max Bachmann
01571f23b7
move grapheme clusters to ExtractedText
2 years ago
Max Bachmann
f211d09f56
remove python2.7 futures
2 years ago
Max Bachmann
205a969c0e
remove unused includes
2 years ago
Max Bachmann
f3825cdeb6
only call `words_normalized` once
2 years ago
Gerber, Mike
dcc10c5389
✔️ Skip test_lines_similar() for now
...
test_lines_similar() fails with rapidfuzz 2.5 and is flawed anyway:
The test was based on our own implementation that used __eq__ and not __hash__ as
rapidfuzz does. Need to review this in the future.
2 years ago
Gerber, Mike
555f586775
📝 Note that old terminals might not render the Unicode characters correctly
2 years ago
Gerber, Mike
c4e85da5ab
🐛 Update editops() and seq_align() due to RapidFuzz API changes
2 years ago
Gerber, Mike
15dfbac3a7
Revert "Revert "Merge pull request #67 from maxbachmann/rapidfuzz""
...
This reverts commit 76bd50f1db.
2 years ago
Gerber, Mike
ede9402a6c
Revert " 💩 Stick with rapidfuzz < 2.1.0 for now"
...
This reverts commit 0e153db9ca.
2 years ago
Gerber, Mike
0e153db9ca
💩 Stick with rapidfuzz < 2.1.0 for now
2 years ago
Gerber, Mike
76bd50f1db
Revert "Merge pull request #67 from maxbachmann/rapidfuzz"
...
This reverts commit 85f751aacc, reversing changes made to 1febea8c92.
2 years ago
Mike Gerber
85f751aacc
Merge pull request #67 from maxbachmann/rapidfuzz
...
replace usage of deprecated rapidfuzz APIs
2 years ago
Max Bachmann
e543438496
replace usage of deprecated rapidfuzz APIs
2 years ago
Mike Gerber
1febea8c92
Merge pull request #66 from stweil/master
...
Ignore Python build artifacts
3 years ago
Stefan Weil
101f50ec88
Ignore Python build artifacts
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
3 years ago
Gerber, Mike
edc24cd4db
✔️ DroneCI: Build on Python 3.6 → 3.10
3 years ago
Gerber, Mike
d726396002
👷🏾‍♂️ Remove str() on Path objects
...
As of Python 3.6 we don't need to call str() on Path objects anymore.
See also gh-20.
3 years ago
Gerber, Mike
a19224dc46
✔️ CircleCI: Stop testing using Python 3.5
...
The latest rapidfuzz updates broke Python 3.5 support. As it has been EOL for some time now,
we are stopping testing with it.
See also gh-65 and gh-20.
3 years ago
Gerber, Mike
76bacc0f15
🐛 Bump rapidfuzz dep to >= 2.0.5 (Fixes gh-65)
3 years ago
Gerber, Mike
195354c6d4
Merge branch 'feat/compare-line-texts'
3 years ago
Gerber, Mike
8a3f5e48c2
🐛 dinglehopper: Patch word_break only once
...
Previously, we (accidentally) patched uniseg's word_break on every call
to words(). Do it only once.
3 years ago
Gerber, Mike
b6bde2b7ec
📝 dinglehopper: Document dinglehopper-line-dirs in the README
3 years ago
Gerber, Mike
f77ce857b2
🚧 dinglehopper: Share json_float code
3 years ago
Gerber, Mike
5b394649a7
🚧 dinglehopper: Compute WER in line-dirs CLI
3 years ago
Gerber, Mike
cb2be96179
🚧 dinglehopper: Add word differences in line-dirs report
3 years ago
Gerber, Mike
dbb660615a
🚧 dinglehopper: Compare line text directories (WIP)
3 years ago
Gerber, Mike
a018006f98
🚧 dinglehopper: Compare line text directories (WIP)
3 years ago
Gerber, Mike
36b36f6986
🚧 dinglehopper: Compare line text directories (WIP)
3 years ago
Gerber, Mike
f0f3cd2d96
⬆️ dinglehopper: Require rapidfuzz >= 1.9.1
...
See https://github.com/qurator-spk/dinglehopper/issues/64 .
3 years ago
Gerber, Mike
a5c9c7438f
💩 ocrd-galley: Work around OCR-D/core#730
...
OCR-D/core currently needs six until the next release. Fix the build by
requiring it here.
3 years ago
Gerber, Mike
7d26b049d1
Merge branch 'fix/ci-py310'
3 years ago
Gerber, Mike
51a44895dc
⬆️ CircleCI: Add Python 3.10
3 years ago
Gerber, Mike
1f8fa5176f
Revert " ⬆️ CircleCI: Add Python 3.10"
...
This reverts commit b2b21839c2.
3 years ago
Gerber, Mike
b2b21839c2
⬆️ CircleCI: Add Python 3.10
3 years ago
Gerber, Mike
7d85e21cbc
⬆️ CircleCI: Switch to the new cimg/python image
3 years ago
Gerber, Mike
dea0c53f88
Merge branch 'rapidfuzz'
3 years ago
Gerber, Mike
06ea38449c
📝 dinglehopper: Update Levenshtein notebook
3 years ago
Gerber, Mike
3ee688001a
🧹 dinglehopper: Directly import levenshtein() from rapidfuzz
3 years ago
Gerber, Mike
5d496df267
⚡ dinglehopper: Remove tests that only test rapidfuzz's levenshtein()
3 years ago
Gerber, Mike
091f069b3c
⚡ dinglehopper: Remove tests that only test rapidfuzz's levenshtein_ops()
3 years ago