Konstantin Baierer
d6804bd9c3
fix typos
3 years ago
Konstantin Baierer
83adfcfd5a
implement "checkpoint_dir" parameter as a simpler alternative to "checkpoint"
3 years ago
Konstantin Baierer
fe973e58db
add version of calamari in --version output
3 years ago
Gerber, Mike
8fcd331fbd
Merge branch 'feat/update-calamari1'
4 years ago
Konstantin Baierer
e4982aff37
getLogger per method
4 years ago
Konstantin Baierer
f746b73fd0
use make_file_id and assert_file_grp_cardinality
4 years ago
Gerber, Mike
7da45a0ec1
Set pcGtsId
Newest OCR-D validation checks PAGE-XML pcGtsId against METS file/@ID.
Set the pcGtsId here correctly.
Fixes #40.
4 years ago
Gerber, Mike
93190fae3b
⚡ Recognize more than one line at a time (Fixes gh#20)
4 years ago
Gerber, Mike
0334a35870
🐛 Sort predictions in exactly the same way, also when building the text
4 years ago
Gerber, Mike
0c9e1f13c7
🐛 Sort predictions in exactly the same way to make sure we are correctly removing spaces
4 years ago
Gerber, Mike
cd8f6a5fcb
🐛 Use line id for debug message
4 years ago
Gerber, Mike
5b6d8b3f41
🐛 Build line text on our own
Calamari does whitespace post-processing on prediction.sentence, while
it does not do the same on prediction.positions. Do it on our own to
have consistency.
Fixes GH-37.
4 years ago
Gerber, Mike
b802b4deaf
✨ Allow configuring a cut off confidence value for glyph alternatives
4 years ago
Gerber, Mike
ef3fb44fb5
✨ Allow controlling of output hierarchy level, e.g. only line, not words+glyphs
4 years ago
Gerber, Mike
6f4736f8e4
✨ Do word segmentation as expected by OCR-D PAGE specs
4 years ago
Gerber, Mike
0f9c94e7dc
🐛 Start with TextEquiv index=1 to adhere to OCR-D PAGE conventions
https://ocr-d.github.io/page#multiple-textequivs
4 years ago
Gerber, Mike
909632493b
🚧 Add future TODOs
4 years ago
Gerber, Mike
3149e1d9e0
📝 unwanted()
4 years ago
Gerber, Mike
91cca1e1b8
📝 Document why we are using Unicode text segmentation to produce word results
4 years ago
Gerber, Mike
2650189910
🧹 Add whitespace
4 years ago
Gerber, Mike
f75426060e
🧹 Remove debugging print
4 years ago
Gerber, Mike
decaa7b69f
🎨 Use polygon_from_x0y0x1y1 to build word/glyph polygon
4 years ago
Gerber, Mike
2ccfc7b195
🎨 Set vim textwidth
4 years ago
Gerber, Mike
507bc1ce5e
✨ Include proper word + glyph segmentation
4 years ago
Gerber, Mike
24532f693a
🚧 Use character positions as word segmentation
4 years ago
Gerber, Mike
49b6dfe735
🧹 Clean up trailing whitespace
4 years ago
Gerber, Mike
95281f3d29
✨ Add metadata about the recognition operation w/ parameter info
4 years ago
Gerber, Mike
dc38f0ee51
🎨 Use TOOL constant convention from the other OCR-D processors
4 years ago
Robert Sachunsky
103b1d7671
remove existing annotation below the line level to avoid inconsistency
5 years ago
Konstantin Baierer
0fcc5c1f79
pass version to processor base constructor, fix #14
5 years ago
Gerber, Mike
6d2e15b623
🐛 Overwrite text if it exists
5 years ago
Gerber, Mike
31bdf3e425
⬆ Use image_from_segment instead of deprecated resolve_image_as_pil
5 years ago
Gerber, Mike
ebf0d53640
🚧 Do not hardcode used models
5 years ago
Gerber, Mike
0498f9551e
🚧 Update higher TextEquiv levels
5 years ago
Gerber, Mike
319ce3a467
🚧 s/Ocr/Recognize
5 years ago