Commit Graph

69 Commits (2b9cab1a1aeb22fa46f36e2faac87de0e8d0c788)
 

Author SHA1 Message
Gerber, Mike 2b9cab1a1a ⬆ Update ocrd_calamari
Gerber, Mike e5cd5b937e Run pip3 list for easier checking
Gerber, Mike bd24624bd7 ⬆ Do not downgrade to PAGE 2018 anymore
Gerber, Mike 0b2b66a0b4 🔧 Allow setting LOG_LEVEL
Gerber, Mike d6c38b5b9f 🧹 Do not install extra tesserocr
Gerber, Mike f19bba45b8 💩 Remove mysterious TEMP directory for now
Gerber, Mike 68902f923d 📜 Downgrading to PAGE 2018 is not the last step anymore
Gerber, Mike 6c0d7e0aee 💩 Do not fix PAGE image references for now
Gerber, Mike e3bf65b502 ⬆ Update dinglehopper
Gerber, Mike 87968cd297 🧹 README: Move TODO to my usual TODO list
Gerber, Mike debecf71b9 💩 Install the right Pillow version manually...
Gerber, Mike a3d6befb0d 🏗 Build Tesseract from source
Gerber, Mike d903e3634c 📝 README: Clarify workspace TODO
Gerber, Mike 8a04602044 📝 README: Not using podman anymore
Gerber, Mike 0782dbde32 ⬆ Update dinglehopper
Gerber, Mike 343a3fbf82 🔧 Evaluate both Tesseract and Calamari results
Gerber, Mike 0bc06c2fad Run Calamari OCR
Gerber, Mike 001e62f54a 🔧 Use docker, not podman
Gerber, Mike daed87566e 🚑 Don't install typegroups classifier for now
Gerber, Mike d8f3438ac5 🚑 Don't check pixel density
Gerber, Mike b169f35bb1 🔧 Build container with cache again
Gerber, Mike 85ff80d548 Use dinglehopper's new OCR-D interface
Gerber, Mike d5aa273b44 🚧 Use ocr-eval aka dinglehopper
Gerber, Mike be5750f4e1 As a last step, downgrade to PAGE 2018 to support PAGE Viewer
Gerber, Mike cf2b4de2a0 🧹 Validate again after fixing image references
Gerber, Mike 21e00932be 🐛 Use a valid filegrp USE for fontident
Gerber, Mike ade39a278c 🎨 Align file groups
Gerber, Mike 3fee2d4fe6 📌 Use my ocrd_typegroups_classifier fix for passing down the page id
Gerber, Mike 44772f1923 🚧 Work around problems with ocrd-tesserocr producing TextEquiv/@conf
Gerber, Mike 8b67866aac Validate PAGE XML after OCR
Gerber, Mike 0d7fd21446 Validate workspace after each step
Gerber, Mike d37db86da1 📌 Use my ocrd_kraken fix for passing down the page id
Gerber, Mike 4addde2e19 Use PAGE 2019
Gerber, Mike de841746e3 Use PAGE 2019
Gerber, Mike ff0570e151 Use frk for now
Gerber, Mike b4f5d44ac8 🐛 ocrd-bugs: bug-ocropy-segment-littering.sh
Gerber, Mike cc81afa1a5 🧹 No need to clean up after tesserocr
Gerber, Mike 89a2893e4e I do not care for the multiple mets:agents elements
Gerber, Mike 0e63fa1756 ⁉ PyTessApi seems to use both engine modes
Gerber, Mike e3a1afbc93 📝 Document the functions
Gerber, Mike 2204aee104 🐋 Docker: Simplify requirements install
Gerber, Mike ddda6e48bc 🐛 Add my collection of OCR-D bug reproducers
Gerber, Mike cfa7d10747 📜 Add README.md
Gerber, Mike 0ea6b02fff ⬆ Update to ocrd >= 1.0.0b10
Gerber, Mike d49c0bd2d1 XXX Do not run privileged, use udica instead
Gerber, Mike 964aef1393 🐛 Use my version of ocrd_models until fix is merged
Gerber, Mike 51a2ccc224 🧹 Remove container after run
Gerber, Mike f3e37dd16c Do not hardcode path to typegroups model binary
Gerber, Mike 3f366339ad Add container setup
Gerber, Mike 8d66469621 Binarize images before segmenting