53 Commits (master)

Author SHA1 Message Date
Robert Sachunsky 8c2e4ca76d recognize: skip tiny or bin-empty lines, too 2 years ago
Robert Sachunsky 36e513604e descend to all available TextRegions recursively 2 years ago
Robert Sachunsky 01312c6369 recognize: delegate to core functions 2 years ago
Robert Sachunsky 5f23c03cd9 recognize: remove checkpoint param in favour of checkpoint_dir alone 2 years ago
Gerber, Mike 34013ddb02 📝 Reduce process() docstring again 3 years ago
Robert Sachunsky 4c6d6655e1 improve process() docstring 3 years ago
Robert Sachunsky 3bde7cb37f init from constructor not process(), use conventional name setup() 3 years ago
Gerber, Mike 0869386ec4 🐛 Fix word and glyph coordinates 3 years ago
    Fixes GH-57.
Gerber, Mike 4cf25b8119 🎨 Rename input_channels variable to network_input_channels 3 years ago
Gerber, Mike c0902cdef5 Merge branch 'master' into image-features 3 years ago
Mike Gerber 53c94fea95 Merge pull request #53 from OCR-D/resolve-resources 3 years ago
    Resolve resources
Konstantin Baierer 03f5e44e62 define default for checkpoint_dir, but allow checkpoint still 3 years ago
Mike Gerber a014bab5b6 Merge pull request #49 from OCR-D/fix-48 3 years ago
    check for empty line image, ht @andbue, fix #48
Mike Gerber e7fb432e35 Merge pull request #52 from OCR-D/checkpoint_dir 3 years ago
    Checkpoint dir
Konstantin Baierer 00e43b1d1f use Processor.resolve_files to handle on-demand download of models via registry 3 years ago
Konstantin Baierer fdd30ebb89 also add tensorflow version to --version output 3 years ago
Konstantin Baierer d6804bd9c3 fix typos 3 years ago
Konstantin Baierer 83adfcfd5a implement "checkpoint_dir" parameter as a simpler alternative to "checkpoint" 3 years ago
Konstantin Baierer fe973e58db add version of calamari in --version output 3 years ago
Konstantin Baierer df530877dc check for empty line image, ht @andbue, fix #48 3 years ago
Gerber, Mike 8fcd331fbd Merge branch 'feat/update-calamari1' 4 years ago
Konstantin Baierer e4982aff37 getLogger per method 4 years ago
Konstantin Baierer f746b73fd0 use make_file_id and assert_file_grp_cardinality 4 years ago
Gerber, Mike 7da45a0ec1 Set pcGtsId 4 years ago
    Newest OCR-D validation checks PAGE-XML pcGtsId against METS file/@ID.
    Set the pcGtsId here correctly.
    Fixes #40.
Gerber, Mike 93190fae3b Recognize more than one line at a time (Fixes gh#20) 4 years ago
Gerber, Mike 0334a35870 🐛 Sort predictions in exactly the same way, also when building the text 4 years ago
Gerber, Mike 0c9e1f13c7 🐛 Sort predictions in exactly the same way to make sure we are correctly removing spaces 4 years ago
Gerber, Mike cd8f6a5fcb 🐛 Use line id for debug message 4 years ago
Gerber, Mike 5b6d8b3f41 🐛 Build line text on our own 4 years ago
    Calamari does whitespace post-processing on prediction.sentence, while
    it does not do the same on prediction.positions. Do it on our own to
    have consistency.
    Fixes GH-37.
Gerber, Mike b802b4deaf Allow configuring a cut off confidence value for glyph alternatives 4 years ago
Gerber, Mike ef3fb44fb5 Allow controlling of output hierarchy level, e.g. only line, not words+glyphs 4 years ago
Gerber, Mike 6f4736f8e4 Do word segmentation as expected by OCR-D PAGE specs 4 years ago
Gerber, Mike 0f9c94e7dc 🐛 Start with TextEquiv index=1 to adhere to OCR-D PAGE conventions 4 years ago
    https://ocr-d.github.io/page#multiple-textequivs
Gerber, Mike 909632493b 🚧 Add future TODOs 4 years ago
Gerber, Mike 3149e1d9e0 📝 unwanted() 4 years ago
Gerber, Mike 91cca1e1b8 📝 Document why we are using Unicode text segmentation to produce word results 4 years ago
Gerber, Mike 2650189910 🧹 Add whitespace 4 years ago
Gerber, Mike f75426060e 🧹 Remove debugging print 4 years ago
Gerber, Mike decaa7b69f 🎨 Use polygon_from_x0y0x1y1 to build word/glyph polygon 4 years ago
Gerber, Mike 2ccfc7b195 🎨 Set vim textwidth 4 years ago
Gerber, Mike 507bc1ce5e Include proper word + glyph segmentation 4 years ago
Gerber, Mike 24532f693a 🚧 Use character positions as word segmentation 4 years ago
Gerber, Mike 49b6dfe735 🧹 Clean up trailing whitespace 4 years ago
Gerber, Mike 95281f3d29 Add metadata about the recognition operation w/ parameter info 4 years ago
Gerber, Mike dc38f0ee51 🎨 Use TOOL constant convention from the other OCR-D processors 4 years ago
Robert Sachunsky d8db405a4c warn if passing raw images to single-channel models 5 years ago
Robert Sachunsky 103b1d7671 remove existing annotation below the line level to avoid inconsistency 5 years ago
Konstantin Baierer 0fcc5c1f79 pass version to processor base constructor, fix #14 5 years ago
Gerber, Mike 6d2e15b623 🐛 Overwrite text if it exists 5 years ago
Gerber, Mike 31bdf3e425 ⬆ Use image_from_segment instead of deprecated resolve_image_as_pil 5 years ago