-
69325facf2
🐛 Detect encoding (incl BOM) when reading files
Mike Gerber
2023-08-03 17:48:13 +0200
-
325e5af5f5
🐛 Move source into src/ to fix install
Mike Gerber
2023-08-03 17:29:28 +0200
-
db7c051b22
⚙ Migrate to pyproject.toml
Mike Gerber
2023-08-02 20:55:47 +0200
-
fc81233a0e
🚧 CircleCI: Run black
Mike Gerber
2023-07-18 20:41:16 +0200
-
cb0134d2db
🚧 CircleCI: Run black
Mike Gerber
2023-07-18 20:40:17 +0200
-
55d534b981
🚧 CircleCI: Run black
Mike Gerber
2023-07-18 20:37:47 +0200
-
2632cb09b8
🚧 CircleCI: Run black
Mike Gerber
2023-07-18 20:28:55 +0200
-
35be58cb94
Merge pull request #83 from INL/feat/batch-processing
Mike Gerber
2023-05-26 15:28:36 +0200
-
-
6d3a8cecd2
Merge pull request #82 from CircleCI-config-suggestions-bot/StoreTestResults
Mike Gerber
2023-05-24 18:50:40 +0200
-
-
207804e6a6
Add batch processing and report summaries
Ruud de Jong
2023-05-12 09:55:00 +0200
-
-
-
89814cbe4b
Upload test results to CircleCI
CircleCI Config Suggestions Bot
2023-05-05 14:21:14 -0400
-
-
dd9303b429
🧹 .gitignore .python-version (for pyenv)
neingeist
2023-04-20 20:15:44 +0200
-
f1fc3f1880
🧹 Remove qurator. namespace prefix
Mike Gerber
2023-03-27 18:25:39 +0200
-
f668963a2e
🐛 Fix installing by calling find_namespace_packages in setup.py
Mike Gerber
2023-03-27 14:34:52 +0200
-
c4ab7c9a7c
🕸 Do not use deprecated ID, pageId options
Mike Gerber
2023-03-14 13:16:09 +0100
-
b4ac24ac9d
🔧 Remove explicit namespace_packages
Mike Gerber
2023-03-14 12:59:10 +0100
-
2a090c9b5a
✔ CircleCI: Explicitly install binary opencv-python-headless (dep of OCR-D?) to avoid compilation
Mike Gerber
2023-03-14 12:49:02 +0100
-
833efa37da
🐛 Remove deprecated declare_namespace call
Mike Gerber
2023-03-14 12:44:22 +0100
-
0fd4ea1973
✔ Add @cneud's former 40 GB problem files to the test suite
Gerber, Mike
2023-03-02 16:24:08 +0100
-
0f0819512e
🎨 Reformat using Black
Gerber, Mike
2023-03-02 10:22:51 +0100
-
2268f32a78
✔ CircleCI: Test on Python 3.11
Gerber, Mike
2023-03-02 10:06:00 +0100
-
d07bd5ecc6
add version to ocrd-tool.json (and setup.py)
Konstantin Baierer
2023-02-28 17:14:04 +0100
-
-
a18b25b163
🐛 Update tests for ExtractedText
Gerber, Mike
2023-01-27 19:13:45 +0100
-
f48e305347
use uniseg again
Max Bachmann
2022-10-12 18:52:58 +0200
-
d2bbc8a6c7
update rapidfuzz version
Max Bachmann
2022-09-11 02:38:32 +0200
-
a1f0a5e2d3
replace uniseg with uniseg2
Max Bachmann
2022-08-29 22:08:25 +0200
-
22c3817f45
apply black
Max Bachmann
2022-08-29 01:50:19 +0200
-
01571f23b7
move grapheme clusters to ExtractedText
Max Bachmann
2022-08-29 01:49:04 +0200
-
f211d09f56
remove python2.7 futures
Max Bachmann
2022-08-29 00:50:33 +0200
-
205a969c0e
remove unused includes
Max Bachmann
2022-08-29 00:48:40 +0200
-
f3825cdeb6
only call `words_normalized` once
Max Bachmann
2022-08-29 00:22:23 +0200
-
-
dcc10c5389
✔️ Skip test_lines_similar() for now
Gerber, Mike
2022-08-18 15:51:13 +0200
-
555f586775
📝 Note that old terminals might not render the Unicode characters correctly
Gerber, Mike
2022-08-17 17:59:15 +0200
-
c4e85da5ab
🐛 Update editops() and seq_align() due to RapidFuzz API changes
Gerber, Mike
2022-08-17 17:55:44 +0200
-
15dfbac3a7
Revert "Revert "Merge pull request #67 from maxbachmann/rapidfuzz""
Gerber, Mike
2022-08-17 11:42:19 +0200
-
ede9402a6c
Revert "💩 Stick with rapidfuzz < 2.1.0 for now"
Gerber, Mike
2022-08-17 11:42:07 +0200
-
0e153db9ca
💩 Stick with rapidfuzz < 2.1.0 for now
Gerber, Mike
2022-08-16 19:34:48 +0200
-
76bd50f1db
Revert "Merge pull request #67 from maxbachmann/rapidfuzz"
Gerber, Mike
2022-08-16 19:31:28 +0200
-
85f751aacc
Merge pull request #67 from maxbachmann/rapidfuzz
Mike Gerber
2022-08-16 16:35:54 +0200
-
-
e543438496
replace usage of deprecated rapidfuzz APIs
Max Bachmann
2022-08-07 10:40:31 +0200
-
-
1febea8c92
Merge pull request #66 from stweil/master
Mike Gerber
2022-03-30 13:40:36 +0200
-
-
101f50ec88
Ignore Python build artifacts
Stefan Weil
2022-03-24 16:51:37 +0100
-
-
edc24cd4db
✔️ DroneCI: Build on Python 3.6 → 3.10
Gerber, Mike
2022-03-03 16:35:26 +0100
-
d726396002
👷🏾‍♂️ Remove str() on Path objects
Gerber, Mike
2022-03-02 11:19:40 +0100
-
a19224dc46
✔️ CircleCI: Stop testing using Python 3.5
Gerber, Mike
2022-02-28 14:46:34 +0100
-
76bacc0f15
🐛 Bump rapidfuzz dep to >= 2.0.5 (Fixes gh-65)
Gerber, Mike
2022-02-28 14:35:54 +0100
-
195354c6d4
Merge branch 'feat/compare-line-texts'
Gerber, Mike
2022-01-24 18:46:33 +0100
-
-
8a3f5e48c2
🐛 dinglehopper: Patch word_break only once
Gerber, Mike
2022-01-24 18:44:30 +0100
-
b6bde2b7ec
📝 dinglehopper: Document dinglehopper-line-dirs in the README
Gerber, Mike
2021-12-15 11:16:40 +0100
-
f77ce857b2
🚧 dinglehopper: Share json_float code
Gerber, Mike
2021-12-14 18:37:07 +0100
-
5b394649a7
🚧 dinglehopper: Compute WER in line-dirs CLI
Gerber, Mike
2021-12-14 18:33:20 +0100
-
cb2be96179
🚧 dinglehopper: Add word differences in line-dirs report
Gerber, Mike
2021-12-14 18:20:04 +0100
-
dbb660615a
🚧 dinglehopper: Compare line text directories (WIP)
Gerber, Mike
2021-12-13 20:02:18 +0100
-
a018006f98
🚧 dinglehopper: Compare line text directories (WIP)
Gerber, Mike
2021-12-13 19:32:55 +0100
-
36b36f6986
🚧 dinglehopper: Compare line text directories (WIP)
Gerber, Mike
2021-12-13 19:26:21 +0100
-
-
f0f3cd2d96
⬆️ dinglehopper: Require rapidfuzz >= 1.9.1
Gerber, Mike
2021-12-14 11:35:57 +0100
-
a5c9c7438f
💩 ocrd-galley: Work around OCR-D/core#730
Gerber, Mike
2021-11-05 17:05:54 +0100
-
7d26b049d1
Merge branch 'fix/ci-py310'
Gerber, Mike
2021-10-26 13:28:57 +0200
-
-
51a44895dc
⬆️ CircleCI: Add Python 3.10
Gerber, Mike
2021-10-26 13:24:50 +0200
-
-
1f8fa5176f
Revert "⬆️ CircleCI: Add Python 3.10"
Gerber, Mike
2021-10-23 15:22:57 +0200
-
b2b21839c2
⬆️ CircleCI: Add Python 3.10
Gerber, Mike
2021-10-22 18:41:47 +0200
-
7d85e21cbc
⬆️ CircleCI: Switch to the new cimg/python image
Gerber, Mike
2021-10-22 18:39:54 +0200
-
dea0c53f88
Merge branch 'rapidfuzz'
Gerber, Mike
2021-10-22 18:19:58 +0200
-
-
06ea38449c
📝 dinglehopper: Update Levenshtein notebook
Gerber, Mike
2021-10-22 16:58:40 +0200
-
3ee688001a
🧹 dinglehopper: Directly import levenshtein() from rapidfuzz
Gerber, Mike
2021-10-22 16:30:21 +0200
-
5d496df267
⚡ dinglehopper: Remove tests that only test rapidfuzz's levenshtein()
Gerber, Mike
2021-10-22 16:26:55 +0200
-
091f069b3c
⚡ dinglehopper: Remove tests that only test rapidfuzz's levenshtein_ops()
Gerber, Mike
2021-10-22 16:21:16 +0200
-
af8da1d716
⚡ dinglehopper: Use rapidfuzz for editops
Gerber, Mike
2021-10-22 15:38:59 +0200
-
-
9f8f88df1f
Reintroduce tooltips in report.
Benjamin Rosemann
2021-06-15 08:58:56 +0200
-
12dcdb81da
Add metrics parameter to integration test
Benjamin Rosemann
2021-06-14 17:08:02 +0200
-
7642a53091
Allow disabling the html report.
Benjamin Rosemann
2021-06-14 16:25:31 +0200
-
e8ccffb275
Updated reports and dependencies.
Benjamin Rosemann
2021-06-14 15:52:14 +0200
-
40f23b8482
Added comments
Benjamin Rosemann
2021-06-14 12:29:34 +0200
-
cee7b6891b
Fix CI Build
Benjamin Rosemann
2021-06-12 09:43:02 +0200
-
714b569195
Fixed some flake8 and mypy issues.
Benjamin Rosemann
2021-06-11 16:09:19 +0200
-
a44a3d4bf2
Error handling
Benjamin Rosemann
2021-06-11 15:33:13 +0200
-
06468a436e
Implemented new metrics behaviour
Benjamin Rosemann
2021-06-11 15:08:45 +0200
-
9f5112f8f6
Remove support for ExtractedText for bag metrics.
Benjamin Rosemann
2021-06-11 10:23:26 +0200
-
381fe7cb6b
Switch to result tuple instead of multiple return parameters
Benjamin Rosemann
2021-06-11 10:21:23 +0200
-
974ca3e5c0
Split html and json report generation
Benjamin Rosemann
2021-06-11 09:35:26 +0200
-
8cd624f795
Add BoC and BoW metric
Benjamin Rosemann
2021-06-08 17:41:44 +0200
-
4ccae9432d
Move metrics into separate package
Benjamin Rosemann
2021-05-27 16:37:34 +0200
-
45465f8d13
Remove restriction on Python 3.5
Benjamin Rosemann
2021-05-27 16:26:02 +0200
-
-
249787686f
Merge branch 'master' of github.com:qurator-spk/dinglehopper
Gerber, Mike
2021-05-20 09:42:15 +0200
-
-
2a6cc5823e
🐛 dinglehopper: Call initLogging before logging
Gerber, Mike
2021-05-20 09:39:09 +0200
-
675a096dfe
Remove restrictions on numpy
Benjamin Rosemann
2021-05-19 15:02:49 +0200
-
0b9af3a21e
Merge pull request #58 from kba/unorderedgroupindexed
Mike Gerber
2021-05-18 18:32:32 +0200
-
-
7fde00d911
ReadingOrder may also contain UnorderedGroupIndexed
Konstantin Baierer
2021-05-18 17:34:08 +0200
-
-
a39a89a50d
Adapt version matrix
Benjamin Rosemann
2021-05-05 16:52:24 +0200
-
685c37ece3
Test missing trigger
Benjamin Rosemann
2021-05-05 16:38:26 +0200
-
0f69ec85fa
Also consider packages on CircleCI
Benjamin Rosemann
2021-05-05 16:31:18 +0200
-
72ad03b4df
Test triggering via .allowed-licenses
Benjamin Rosemann
2021-05-05 16:27:15 +0200
-
1232dee64a
Test with version specific requirement files
Benjamin Rosemann
2021-05-05 16:13:05 +0200
-
15e584f0ab
Introduce version pinning and license checking
Benjamin Rosemann
2021-05-05 15:20:35 +0200
-
-
-
1778b36a9a
🚧 dinglehopper: Read PAGE UnorderedGroup in XML order
Gerber, Mike
2021-04-15 21:09:45 +0200
-
85b784f9a1
Fix problem with json encoding
Benjamin Rosemann
2021-02-16 11:23:37 +0100
-
9e64c4f0d0
Remove obsolete test
Benjamin Rosemann
2020-11-27 11:31:25 +0100
-
b9259b9d01
Add multiprocessing to flexible_character_accuracy
Benjamin Rosemann
2020-11-26 09:58:40 +0100
-
c4f75d5264
Increase cache size for bad OCR results.
Benjamin Rosemann
2020-11-24 17:10:59 +0100
-
84d34f5b26
Fix annoying logging exceptions and encoding errors.
Benjamin Rosemann
2020-11-24 17:10:18 +0100