405 Commits (4466422cda9fa655c74c1f21ef434453e18f8cf9)
 

Author SHA1 Message Date
Mike Gerber fc81233a0e 🚧 CircleCI: Run black 1 year ago
Mike Gerber cb0134d2db 🚧 CircleCI: Run black 1 year ago
Mike Gerber 55d534b981 🚧 CircleCI: Run black 1 year ago
Mike Gerber 2632cb09b8 🚧 CircleCI: Run black 1 year ago
Mike Gerber 35be58cb94 Merge pull request #83 from INL/feat/batch-processing
Add batch processing and report summaries
1 year ago
Mike Gerber 6d3a8cecd2 Merge pull request #82 from CircleCI-config-suggestions-bot/StoreTestResults
Update .circleci/config.yml to use store_test_results
1 year ago
Ruud de Jong 207804e6a6 Add batch processing and report summaries 2 years ago
CircleCI Config Suggestions Bot 89814cbe4b Upload test results to CircleCI 2 years ago
neingeist dd9303b429 🧹 .gitignore .python-version (for pyenv) 2 years ago
Mike Gerber f1fc3f1880 🧹 Remove qurator. namespace prefix 2 years ago
Mike Gerber f668963a2e 🐛 Fix installing by calling find_namespace_packages in setup.py
Turns out just removing __init__.py is not enough for native namespace
packages. We also need to (explicitly) call setuptools.find_namespace_packages()
for setup.py to find the package...

https://packaging.python.org/en/latest/guides/packaging-namespace-packages/#native-namespace-packages

Fixes gh-77.
2 years ago
Mike Gerber c4ab7c9a7c 🕸 Do not use deprecated ID, pageId options
See gh-75.
2 years ago
Mike Gerber b4ac24ac9d 🔧 Remove explicit namespace_packages
Fixes gh-76.
2 years ago
Mike Gerber 2a090c9b5a ✔ CircleCI: Explicitly install binary opencv-python-headless (dep of OCR-D?) to avoid compilation 2 years ago
Mike Gerber 833efa37da 🐛 Remove deprecated declare_namespace call
Remove deprecated declare_namespace call and use implicit namespace (PEP-0420).

Fixes gh-76.
2 years ago
Gerber, Mike 0fd4ea1973 ✔ Add @cneud's former 40 GB problem files to the test suite 2 years ago
Gerber, Mike 0f0819512e 🎨 Reformat using Black 2 years ago
Gerber, Mike 2268f32a78 ✔ CircleCI: Test on Python 3.11 2 years ago
Gerber, Mike a18b25b163 🐛 Update tests for ExtractedText
In PR gh-72, @maxbachmann introduced a new argument for ExtractedText(). Update the
corresponding tests.
2 years ago
Max Bachmann f48e305347 use uniseg again 2 years ago
Max Bachmann d2bbc8a6c7 update rapidfuzz version 2 years ago
Max Bachmann a1f0a5e2d3 replace uniseg with uniseg2 2 years ago
Max Bachmann 22c3817f45 apply black 2 years ago
Max Bachmann 01571f23b7 move grapheme clusters to ExtractedText 2 years ago
Max Bachmann f211d09f56 remove python2.7 futures 2 years ago
Max Bachmann 205a969c0e remove unused includes 2 years ago
Max Bachmann f3825cdeb6 only call `words_normalized` once 2 years ago
Gerber, Mike dcc10c5389 ✔️ Skip test_lines_similar() for now
test_lines_similar() fails with rapidfuzz 2.5 and is flawed anyway:

The test was based on our own implementation that used __eq__ and not __hash__ as
rapidfuzz does. Need to review this in the future.
2 years ago
Gerber, Mike 555f586775 📝 Note that old terminals might not render the Unicode characters correctly 2 years ago
Gerber, Mike c4e85da5ab 🐛 Update editops() and seq_align() due to RapidFuzz API changes 2 years ago
Gerber, Mike 15dfbac3a7 Revert "Revert "Merge pull request #67 from maxbachmann/rapidfuzz""
This reverts commit 76bd50f1db.
2 years ago
Gerber, Mike ede9402a6c Revert "💩 Stick with rapidfuzz < 2.1.0 for now"
This reverts commit 0e153db9ca.
2 years ago
Gerber, Mike 0e153db9ca 💩 Stick with rapidfuzz < 2.1.0 for now 2 years ago
Gerber, Mike 76bd50f1db Revert "Merge pull request #67 from maxbachmann/rapidfuzz"
This reverts commit 85f751aacc, reversing
changes made to 1febea8c92.
2 years ago
Mike Gerber 85f751aacc Merge pull request #67 from maxbachmann/rapidfuzz
replace usage of deprecated rapidfuzz APIs
2 years ago
Max Bachmann e543438496 replace usage of deprecated rapidfuzz APIs 2 years ago
Mike Gerber 1febea8c92 Merge pull request #66 from stweil/master
Ignore Python build artifacts
3 years ago
Stefan Weil 101f50ec88 Ignore Python build artifacts
Signed-off-by: Stefan Weil <sw@weilnetz.de>
3 years ago
Gerber, Mike edc24cd4db ✔️ DroneCI: Build on Python 3.6 → 3.10
3 years ago
Gerber, Mike d726396002 👷🏾‍♂️ Remove str() on Path objects
As of Python 3.6 we don't need to call str() on Path objects anymore.

See also gh-20.
3 years ago
Gerber, Mike a19224dc46 ✔️ CircleCI: Stop testing using Python 3.5
The latest rapidfuzz updates broke Python 3.5 support. As it has been EOL for some time now,
we are no longer testing with it.

See also gh-65 and gh-20.
3 years ago
Gerber, Mike 76bacc0f15 🐛 Bump rapidfuzz dep to >= 2.0.5 (Fixes gh-65) 3 years ago
Gerber, Mike 195354c6d4 Merge branch 'feat/compare-line-texts'
3 years ago
Gerber, Mike 8a3f5e48c2 🐛 dinglehopper: Patch word_break only once
Previously, we (accidentally) patched uniseg's word_break on every call
to words(). Do it only once.
3 years ago
Gerber, Mike b6bde2b7ec 📝 dinglehopper: Document dinglehopper-line-dirs in the README
3 years ago
Gerber, Mike f77ce857b2 🚧 dinglehopper: Share json_float code
3 years ago
Gerber, Mike 5b394649a7 🚧 dinglehopper: Compute WER in line-dirs CLI 3 years ago
Gerber, Mike cb2be96179 🚧 dinglehopper: Add word differences in line-dirs report 3 years ago
Gerber, Mike dbb660615a 🚧 dinglehopper: Compare line text directories (WIP)
3 years ago
Gerber, Mike a018006f98 🚧 dinglehopper: Compare line text directories (WIP) 3 years ago