Commit Graph

256 Commits (92391747a76d0991a4cd0a09618804c2644bc59b)

Author SHA1 Message
Gerber, Mike 8a04602044 📝 README: Not using podman anymore
Gerber, Mike 0782dbde32 ⬆ Update dinglehopper
Gerber, Mike 343a3fbf82 🔧 Evaluate both Tesseract and Calamari results
Gerber, Mike 0bc06c2fad Run Calamari OCR
Gerber, Mike 001e62f54a 🔧 Use docker, not podman
Gerber, Mike daed87566e 🚑 Don't install typegroups classifier for now
Gerber, Mike d8f3438ac5 🚑 Don't check pixel density
Gerber, Mike b169f35bb1 🔧 Build container with cache again
Gerber, Mike 85ff80d548 Use dinglehopper's new OCR-D interface
Gerber, Mike d5aa273b44 🚧 Use ocr-eval aka dinglehopper
Gerber, Mike be5750f4e1 As a last step, downgrade to PAGE 2018 to support PAGE Viewer
Gerber, Mike cf2b4de2a0 🧹 Validate again after fixing image references
Gerber, Mike 21e00932be 🐛 Use a valid filegrp USE for fontident
Gerber, Mike ade39a278c 🎨 Align file groups
Gerber, Mike 3fee2d4fe6 📌 Use my ocrd_typegroups_classifier fix for passing down the page id
Gerber, Mike 44772f1923 🚧 Work around problems with ocrd-tesserocr producing TextEquiv/@conf
Gerber, Mike 8b67866aac Validate PAGE XML after OCR
Gerber, Mike 0d7fd21446 Validate workspace after each step
Gerber, Mike d37db86da1 📌 Use my ocrd_kraken fix for passing down the page id
Gerber, Mike 4addde2e19 Use PAGE 2019
Gerber, Mike de841746e3 Use PAGE 2019
Gerber, Mike ff0570e151 Use frk for now
Gerber, Mike b4f5d44ac8 🐛 ocrd-bugs: bug-ocropy-segment-littering.sh
Gerber, Mike cc81afa1a5 🧹 No need to clean up after tesserocr
Gerber, Mike 89a2893e4e I do not care for the multiple mets:agents elements
Gerber, Mike 0e63fa1756 ⁉ PyTessApi seems to use both engine modes
Gerber, Mike e3a1afbc93 📝 Document the functions
Gerber, Mike 2204aee104 🐋 Docker: Simplify requirements install
Gerber, Mike ddda6e48bc 🐛 Add my collection of OCR-D bug reproducers
Gerber, Mike cfa7d10747 📜 Add README.md
Gerber, Mike 0ea6b02fff ⬆ Update to ocrd >= 1.0.0b10
Gerber, Mike d49c0bd2d1 XXX Do not run privileged, use udica instead
Gerber, Mike 964aef1393 🐛 Use my version of ocrd_models until fix is merged
Gerber, Mike 51a2ccc224 🧹 Remove container after run
Gerber, Mike f3e37dd16c Do not hardcode path to typegroups model binary
Gerber, Mike 3f366339ad Add container setup
Gerber, Mike 8d66469621 Binarize images before segmenting
Gerber, Mike 5e1ece4877 Use ocrd-tesserocr-segment-*
Gerber, Mike e30f03699c TODO Binarization
Gerber, Mike 0d5b5b1b17 XXX does ocrd_tesserocr use the LSTM engine?
Gerber, Mike 16f2f16dbe XXX <error>INCONSISTENCY in TextRegion ID 'dummy'
Gerber, Mike 89abc507e0 XXX ocrd-ocropy-segment throws an exception for buerger_gedichte_1778.ocrd
Gerber, Mike ad3a7c2b95 XXX remove_filegrp link to OCR-D issue
Gerber, Mike f94230c587 Set log level to DEBUG again
Gerber, Mike 2b2c39d6d4 Add a global LOG_LEVEL option
Gerber, Mike fbc3b8ca4f Fix image references
Gerber, Mike b6c490e18b Add a PAGE fix XML step
Gerber, Mike d98ce2d2d4 Add a PAGE validation step
Gerber, Mike 10c4068a99 XXX Global -l DEBUG
Gerber, Mike f8f44e990d Clean up after ocrd-ocropy-segment's mess