13 Commits (e371da899e65e53e7fa0ca0923000d270356fae9)

Author SHA1 Message Date
Benjamin Rosemann e371da899e Switch from custom Levenshtein to python-Levenshtein
As the distance and editops calculation is a performance bottleneck in
this application, we replaced the custom Levenshtein implementation with
the C implementation in the python-Levenshtein package.

We now also have separate entry points for texts with and without
Unicode normalization, because normalization can be done more
efficiently once, during preprocessing.
4 years ago
Gerber, Mike 8cd8314c8a 🐛 dinglehopper: Bump up ocrd req for zip_input_files
See also GH-49.
4 years ago
Gerber, Mike f2367ac0c3 🐛 Fix OCR-D CLI for newest OCR-D
Now that find_files() is a generator, we can't use [0] to get the file.
5 years ago
Gerber, Mike 5ed184c8c4 dinglehopper: Show a progressbar on --progress 5 years ago
Gerber, Mike f50591abac Merge branch 'feat/display-segment-id' 5 years ago
Gerber, Mike b14c35e147 🎨 dinglehopper: Use multimethod to handle str vs ExtractedText 5 years ago
Konstantin Baierer 004ae298ca ocrd cli: use make_file_id and assert_file_grp_cardinality 5 years ago
Gerber, Mike 2c69e077fe 🚧 dinglehopper: WIP data structure for extracted text 5 years ago
Gerber, Mike cdfd4d321d 🐛 dinglehopper: Add missing requirement MarkupSafe 5 years ago
Gerber, Mike 48a31ce672 Revert "Merge branch 'master' of https://github.com/qurator-spk/sbb_textline_detector"
This reverts commit 2c89bf3b35ee290d7b830ef270df3a96aa48245e, reversing
changes made to 9f7e413148ca5dbac9b555d7b0d0a5fa3a0f5340.
5 years ago
b-vr103 1303a7d92f Merge branch 'master' of https://github.com/qurator-spk/sbb_textline_detector 5 years ago
Gerber, Mike 02a0e093bf dinglehopper: Add OCR-D interface 6 years ago
Gerber, Mike 89048bf55d ➡ Move dinglehopper into its own directory 6 years ago
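
Commit f2367ac0c3 above notes that once find_files() is a generator, [0] can no longer be used to get the file. The sketch below illustrates that general Python behaviour and one common way around it. It assumes a hypothetical stand-in named find_files with made-up file group names and file names; it is not the OCR-D API and not necessarily the fix applied in that commit.

```python
def find_files(file_grp):
    """Hypothetical stand-in: lazily yields the file names in a file group."""
    files = {"OCR-D-GT": ["gt_0001.xml"], "OCR-D-OCR": ["ocr_0001.xml"]}
    for name in files.get(file_grp, []):
        yield name


def first_file(file_grp):
    """Return the first match, or None if the generator yields nothing."""
    return next(find_files(file_grp), None)


# Generators do not support indexing, so [0] raises TypeError.
try:
    find_files("OCR-D-GT")[0]
except TypeError:
    indexing_failed = True
else:
    indexing_failed = False

print(indexing_failed)
print(first_file("OCR-D-GT"))
```

Passing a default to next() avoids a StopIteration when nothing matches. Wrapping the call in list(...) would also restore indexing, at the cost of consuming the whole generator.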