35 Commits (d6804bd9c3047956c90756baa554d849121861d5)

Author SHA1 Message Date
Konstantin Baierer d6804bd9c3 fix typos 3 years ago
Konstantin Baierer 83adfcfd5a implement "checkpoint_dir" parameter as a simpler alternative to "checkpoint" 3 years ago
Konstantin Baierer fe973e58db add version of calamari in --version output 3 years ago
Gerber, Mike 8fcd331fbd Merge branch 'feat/update-calamari1' 4 years ago
Konstantin Baierer e4982aff37 getLogger per method 4 years ago
Konstantin Baierer f746b73fd0 use make_file_id and assert_file_grp_cardinality 4 years ago
Gerber, Mike 7da45a0ec1 Set pcGtsId 4 years ago
    Newest OCR-D validation checks PAGE-XML pcGtsId against METS file/@ID.
    Set the pcGtsId here correctly.

    Fixes #40.
Gerber, Mike 93190fae3b Recognize more than one line at a time (Fixes gh#20) 4 years ago
Gerber, Mike 0334a35870 🐛 Sort predictions in exactly the same way, also when building the text 4 years ago
Gerber, Mike 0c9e1f13c7 🐛 Sort predictions in exactly the same way to make sure we are correctly removing spaces 4 years ago
Gerber, Mike cd8f6a5fcb 🐛 Use line id for debug message 4 years ago
Gerber, Mike 5b6d8b3f41 🐛 Build line text on our own 4 years ago
    Calamari does whitespace post-processing on prediction.sentence, while
    it does not do the same on prediction.positions. Do it on our own to
    have consistency.

    Fixes GH-37.
Gerber, Mike b802b4deaf Allow configuring a cut off confidence value for glyph alternatives 4 years ago
Gerber, Mike ef3fb44fb5 Allow controlling of output hierarchy level, e.g. only line, not words+glyphs 4 years ago
Gerber, Mike 6f4736f8e4 Do word segmentation as expected by OCR-D PAGE specs 4 years ago
Gerber, Mike 0f9c94e7dc 🐛 Start with TextEquiv index=1 to adhere to OCR-D PAGE conventions 4 years ago
    https://ocr-d.github.io/page#multiple-textequivs
Gerber, Mike 909632493b 🚧 Add future TODOs 4 years ago
Gerber, Mike 3149e1d9e0 📝 unwanted() 4 years ago
Gerber, Mike 91cca1e1b8 📝 Document why we are using Unicode text segmentation to produce word results 4 years ago
Gerber, Mike 2650189910 🧹 Add whitespace 4 years ago
Gerber, Mike f75426060e 🧹 Remove debugging print 4 years ago
Gerber, Mike decaa7b69f 🎨 Use polygon_from_x0y0x1y1 to build word/glyph polygon 4 years ago
Gerber, Mike 2ccfc7b195 🎨 Set vim textwidth 4 years ago
Gerber, Mike 507bc1ce5e Include proper word + glyph segmentation 4 years ago
Gerber, Mike 24532f693a 🚧 Use character positions as word segmentation 4 years ago
Gerber, Mike 49b6dfe735 🧹 Clean up trailing whitespace 4 years ago
Gerber, Mike 95281f3d29 Add metadata about the recognition operation w/ parameter info 4 years ago
Gerber, Mike dc38f0ee51 🎨 Use TOOL constant convention from the other OCR-D processors 4 years ago
Robert Sachunsky 103b1d7671 remove existing annotation below the line level to avoid inconsistency 5 years ago
Konstantin Baierer 0fcc5c1f79 pass version to processor base constructor, fix #14 5 years ago
Gerber, Mike 6d2e15b623 🐛 Overwrite text if it exists 5 years ago
Gerber, Mike 31bdf3e425 ⬆ Use image_from_segment instead of deprecated resolve_image_as_pil 5 years ago
Gerber, Mike ebf0d53640 🚧 Do not hardcode used models 5 years ago
Gerber, Mike 0498f9551e 🚧 Update higher TextEquiv levels 5 years ago
Gerber, Mike 319ce3a467 🚧 s/Ocr/Recognize 5 years ago