mirror of
https://github.com/qurator-spk/dinglehopper.git
synced 2025-11-01 18:04:14 +01:00
Because the distance and editops calculation is a performance bottleneck in this application, we replaced the custom Levenshtein implementation with the C implementation from the python-Levenshtein package. We now also have separate entrypoints for texts with and without Unicode normalization, because normalization can be done more efficiently once, during preprocessing.

| Name |
|---|
| data |
| __init__.py |
| extracted_text_test.py |
| test_align.py |
| test_character_error_rate.py |
| test_edit_distance.py |
| test_editops.py |
| test_integ_align.py |
| test_integ_character_error_rate_ocr.py |
| test_integ_cli_valid_json.py |
| test_integ_edit_distance_ocr.py |
| test_integ_ocrd_cli.py |
| test_integ_table_extraction.py |
| test_integ_word_error_rate_ocr.py |
| test_ocr_files.py |
| test_word_error_rate.py |
| util.py |
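
The commit message above describes two entrypoints: one that normalizes its inputs and one that expects already-normalized text. The following is a minimal sketch of that split, not the project's actual code. The function names `distance` and `distance_prenormalized` are hypothetical, NFC is an assumed normalization form, and a pure-Python dynamic-programming routine stands in for the C implementation from python-Levenshtein so that the example runs without extra dependencies.

```python
import unicodedata


def _levenshtein(a, b):
    """Plain dynamic-programming edit distance.

    Stand-in for the C implementation mentioned in the commit message.
    """
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(
                min(
                    previous[j] + 1,         # deletion
                    current[j - 1] + 1,      # insertion
                    previous[j - 1] + cost,  # substitution or match
                )
            )
        previous = current
    return previous[-1]


def distance_prenormalized(s1, s2):
    """Hypothetical entrypoint: inputs are assumed to be normalized already."""
    return _levenshtein(s1, s2)


def distance(s1, s2):
    """Hypothetical entrypoint: normalize both inputs (NFC assumed), then compare."""
    s1 = unicodedata.normalize("NFC", s1)
    s2 = unicodedata.normalize("NFC", s2)
    return distance_prenormalized(s1, s2)
```

A caller that compares the same text many times could normalize it once during preprocessing and then call the pre-normalized entrypoint repeatedly, which avoids redoing the normalization on every comparison.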