Mirror of https://github.com/qurator-spk/dinglehopper.git
Synced 2025-07-11 11:29:57 +02:00
Because the distance and editops calculations are a performance bottleneck in this application, we replaced the custom Levenshtein implementation with the C implementation from the python-Levenshtein package. We now also have separate entry points for texts with and without Unicode normalization, because normalization can be done more efficiently once, during preprocessing.

data
__init__.py
extracted_text_test.py
test_align.py
test_character_error_rate.py
test_edit_distance.py
test_editops.py
test_integ_align.py
test_integ_character_error_rate_ocr.py
test_integ_cli_valid_json.py
test_integ_edit_distance_ocr.py
test_integ_ocrd_cli.py
test_integ_table_extraction.py
test_integ_word_error_rate_ocr.py
test_ocr_files.py
test_word_error_rate.py
util.py
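
The commit message above describes splitting the distance calculation into one entry point that applies Unicode normalization and one that assumes already-normalized input. The following is a minimal sketch of that idea, not the project's actual code: the function names `distance` and `distance_normalized` are hypothetical, NFC is assumed as the normalization form, and a pure-Python Levenshtein routine stands in for the C implementation so that the example runs with the standard library alone.

```python
import unicodedata


def levenshtein(a, b):
    """Pure-Python stand-in for a C Levenshtein implementation (assumption)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(
                min(
                    prev[j] + 1,               # deletion
                    curr[j - 1] + 1,           # insertion
                    prev[j - 1] + (ca != cb),  # substitution or match
                )
            )
        prev = curr
    return prev[-1]


def distance_normalized(a, b):
    """Hypothetical entry point: inputs are assumed to be normalized already."""
    return levenshtein(a, b)


def distance(a, b):
    """Hypothetical entry point: normalize to NFC (assumed form), then compare."""
    a = unicodedata.normalize("NFC", a)
    b = unicodedata.normalize("NFC", b)
    return distance_normalized(a, b)


precomposed = "Schly\u00f1"   # n with tilde as a single code point
decomposed = "Schlyn\u0303"   # n followed by a combining tilde

print(distance(precomposed, decomposed))             # compared after normalization
print(distance_normalized(precomposed, decomposed))  # compared as raw code points
```

With this split, a caller that compares the same text many times can normalize it once during preprocessing and then call the normalization-free entry point repeatedly.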