Commit graph

105 commits

SHA1 Message Date
6e3b4e707a 💩 Mark textline_detection tar for update 2019-11-28 17:26:12 +01:00
03badaf887 ⬆ Update sbb_textline_detector 2019-11-28 16:38:41 +01:00
d166077a55 Update to sbb_textline_detector with the fixed AlternativeImage support (= merged PAGE results) 2019-11-20 12:40:05 +01:00
de47a3e5b1 🔥 Remove now unused page_fix_image_references() 2019-11-20 12:39:02 +01:00
eb58448f6d 🎡 Check pip dependencies early 2019-11-19 14:35:23 +01:00
3ecf478f79 ⬆ Update sbb_textline_detector/ocrd_calamari 2019-10-31 18:21:17 +01:00
edd0930952 ⚙ Download files from the web 2019-10-31 15:22:12 +01:00
21df393b0f ⚙ Suggest commands to fix data submodule 2019-10-31 12:41:03 +01:00
cd2e92fbc4 ⚙ Give use choice to fix data sub-dir 2019-10-31 11:32:37 +01:00
eeb733486f ⚙ Sanity-check data submodule 2019-10-30 17:54:05 +01:00
1af18c629e 🧹 Validate imagefilename again 2019-10-30 11:25:34 +01:00
c994be2efb 🐛 Remove obsolete Pillow==5.4.1 dependency (fixes setup) 2019-10-30 11:13:08 +01:00
34b34fc84f 🐛 Do not run as root 2019-10-30 11:12:11 +01:00
6ad8d50552 ⬆ Update sbb_textline_detector 2019-10-30 11:11:24 +01:00
2a6df526b5 ⬆ Update sbb_textline_detector 2019-10-22 17:58:24 +02:00
de49aa715b ⬆ Update to OCR-D 1.0.0 2019-10-21 17:04:49 +02:00
7025d960b4 Use ocrd_olena for binarization 2019-10-21 17:04:06 +02:00
63c364207c 💩 Add a funny workaround to get git-annex to give us our files 2019-10-18 16:32:31 +02:00
33e25641f2 ⬆ Update sbb_textline_detector 2019-10-18 13:22:47 +02:00
2b67f5feb4 ⬆ Update sbb_textline_detector 2019-10-16 12:50:31 +02:00
3687d6d7b4 🧹 Do not remove line confidences anymore 2019-10-11 19:17:30 +02:00
6454d20998 Use sbb_textline_detector to segment lines 2019-10-11 19:16:43 +02:00
735e9599d7 🐛 ocrd-bugs: Most/All workspaces in bag files don't validate 2019-10-09 13:36:54 +02:00
0f8f1d814b 🐛 Mkdir robustly 2019-10-07 12:36:40 +02:00
bdab016e2c Use GT4HistOCR_2000000 model from qurator-data for Tesseract 2019-10-02 16:48:28 +02:00
57ff3fc19b ⬆ Update data 2019-10-02 16:02:45 +02:00
ff2cc50aed ⬆ Update dinglehopper (substitutions) 2019-10-01 13:19:16 +02:00
0c5ed94892 ⬆ Update dinglehopper (to fix NFC trouble + substitutions) 2019-10-01 11:30:00 +02:00
1dde641d5a ⬆ Update dinglehopper (to fix text alignment) 2019-09-30 18:26:12 +02:00
47dd5d3b62 🎨 Move XML schemata to a better path 2019-09-30 18:25:54 +02:00
02457155aa ⬆ Update dinglehopper (to fix reading order) 2019-09-30 16:10:13 +02:00
af2034400a 🎨 Add extra newlines to separate steps 2019-09-30 12:26:14 +02:00
1863439d92 💩 Remove extra Pillow dependency workarounds 2019-09-30 12:25:31 +02:00
81b7e5458c 💩 Install Pillow 5.4.1 because pip does not have a dependency resolver 2019-09-27 19:18:34 +02:00
224762e1bb 🐛 Let ocrd_calamari handle the weird setuptools depencency 2019-09-27 16:31:30 +02:00
a272237bd8 ⬆ Update ocrd dependency 2019-09-27 15:42:11 +02:00
2b9cab1a1a ⬆ Update ocrd_calamari 2019-09-27 15:34:38 +02:00
e5cd5b937e Run pip3 list for easier checking 2019-09-27 13:16:14 +02:00
bd24624bd7 ⬆ Do not downgrade to PAGE 2018 anymore 2019-09-27 13:02:46 +02:00
0b2b66a0b4 🔧 Allow setting LOG_LEVEL 2019-09-27 12:09:37 +02:00
d6c38b5b9f 🧹 Do not install extra tesserocr 2019-09-26 18:12:15 +02:00
f19bba45b8 💩 Remove mysterious TEMP directory for now 2019-09-26 16:55:54 +02:00
68902f923d 📜 Downgrading to PAGE 2018 is not the last step anymore 2019-09-26 16:55:02 +02:00
6c0d7e0aee 💩 Do not fix PAGE image references for now 2019-09-26 16:46:12 +02:00
e3bf65b502 ⬆ Update dinglehopper 2019-09-26 15:24:52 +02:00
87968cd297 🧹 README: Move TODO to my usual TODO list 2019-09-24 13:44:15 +02:00
debecf71b9 💩 Install the right Pillow version manually... 2019-09-23 15:04:28 +02:00
a3d6befb0d 🏗 Build Tesseract from source 2019-09-23 15:02:59 +02:00
d903e3634c 📝 README: Clarify workspace TODO 2019-09-23 15:01:27 +02:00
8a04602044 📝 README: Not using podman anymore 2019-09-23 15:01:03 +02:00