Because distance and editops calculation is a performance bottleneck in this application, we replaced the custom Levenshtein implementation with the C implementation from the python-Levenshtein package. We now also have separate entry points for texts with and without Unicode normalization, because normalization can be done more efficiently once, during preprocessing. | 4 years ago | |
---|---|---|
data | 4 years ago | |
__init__.py | 5 years ago | |
extracted_text_test.py | 5 years ago | |
test_align.py | 5 years ago | |
test_character_error_rate.py | 5 years ago | |
test_edit_distance.py | 4 years ago | |
test_editops.py | 4 years ago | |
test_integ_align.py | 5 years ago | |
test_integ_character_error_rate_ocr.py | 5 years ago | |
test_integ_cli_valid_json.py | 5 years ago | |
test_integ_edit_distance_ocr.py | 5 years ago | |
test_integ_ocrd_cli.py | 5 years ago | |
test_integ_table_extraction.py | 4 years ago | |
test_integ_word_error_rate_ocr.py | 5 years ago | |
test_ocr_files.py | 5 years ago | |
test_word_error_rate.py | 5 years ago | |
util.py | 5 years ago |