mirror of https://github.com/mikegerber/ocrd_calamari.git synced 2025-06-09 19:59:53 +02:00
Commit graph

147 commits

Author SHA1 Message Date
3156121ff7 📝 Let intro mention ocrd_calamari + PAGE XML, link to OCR-D 2020-10-01 13:21:14 +02:00
bb9b1ab41b 🐛 CircleCI: Ignore screenshots branch (second try) 2020-09-03 11:55:00 +02:00
7705374cfc 🐛 CircleCI: Ignore screenshots branch 2020-09-03 11:44:47 +02:00
c417a0ab77 📝 README: Add a screenshot of example output 2020-09-03 11:31:11 +02:00
210c126003 Merge pull request #42 from OCR-D/file-ids-and-such
use make_file_id and assert_file_grp_cardinality
2020-08-06 17:37:06 +02:00
Konstantin Baierer
f746b73fd0 use make_file_id and assert_file_grp_cardinality 2020-08-06 15:24:53 +02:00
f6dfedf837 🗒️ README-DEV: Also release on GitHub 2020-08-06 14:04:17 +02:00
86410110bc 📦 v0.0.7 2020-08-06 12:39:45 +02:00
4c85b8311f Merge pull request #41 from OCR-D/fix/set-pcgtsid
Set pcGtsId
2020-08-06 12:37:17 +02:00
7da45a0ec1 Set pcGtsId
Newest OCR-D validation checks PAGE-XML pcGtsId against METS file/@ID.
Set the pcGtsId here correctly.

Fixes #40.
2020-08-06 12:31:47 +02:00
046e3e8ee3 🚧 Tests: Add some TODOs re data + namespace version changes 2020-08-06 11:27:59 +02:00
027fcd7d75 🐛 Fix test file path 2020-07-21 20:10:36 +02:00
c6ced9b3e9 Merge pull request #39 from OCR-D/dont-install-test
setup.py: exclude "test", not "tests", from installation
2020-06-04 12:07:03 +02:00
kba
e03ff4064b setup.py: exclude "test", not "tests", from installation 2020-05-31 20:31:06 +02:00
fb538845d8 📄 Update license (Fixes #35)
Set copyright owner name. Also, going along the lines of "update the year when substantial revision of the work happened", set the copyright years. The latter may not be necessary, because "life of author + 70 years" or something.
2020-02-13 16:49:11 +01:00
123ee61a8b v0.0.6 2020-02-13 16:04:17 +01:00
69df78bce1 🐛 setup.py: Fix GitHub url by s/kba/OCR-D 2020-02-13 16:02:02 +01:00
62e5e0c295 🐛 ocrd-tool.json: Fix GitHub url by s/kba/OCR-D 2020-02-13 16:00:58 +01:00
0334a35870 🐛 Sort predictions in exactly the same way, also when building the text 2020-02-12 17:18:37 +01:00
0c9e1f13c7 🐛 Sort predictions in exactly the same way to make sure we are correctly removing spaces 2020-02-12 16:38:45 +01:00
d2c843aa3f 📦 v0.0.5 2020-02-12 13:33:38 +01:00
cd8f6a5fcb 🐛 Use line id for debug message 2020-02-12 13:32:10 +01:00
5b6d8b3f41 🐛 Build line text on our own
Calamari does whitespace post-processing on prediction.sentence, while
it does not do the same on prediction.positions. Do it on our own to
have consistency.

Fixes GH-37.
2020-02-12 12:25:25 +01:00
30f7e1b246 🐳 Docker: Run pip3 check for good measure 2020-02-06 14:01:36 +01:00
303172b279 📝 Document make targets 2020-02-06 13:53:55 +01:00
a2d1d76dbd 🐳 Docker: Do not use the make target to install calamari-ocr, stick to pip 2020-02-06 13:52:05 +01:00
41f5c8a8fa 🐳 Docker: Upgrade pip to silence warning and fix potential other problems 2020-02-06 13:44:43 +01:00
7c18b1d391 🐳 Docker: Use ocrd/core:master instead of outdated :edge 2020-02-06 13:43:59 +01:00
1fda419f25 🐳 Fix Docker build 2020-02-06 13:43:36 +01:00
71096493ac 📝 README-DEV: Improve info about releasing 2020-02-06 13:04:29 +01:00
b26194179c 📝 README-DEV: Improve markdown 2020-02-06 13:03:06 +01:00
cf7a788854 📝 README-DEV: Mention cleaning up the dict/ directory 2020-02-06 13:02:02 +01:00
4508e3ec47 📦 v0.0.4 2020-02-05 17:55:51 +01:00
73beab1770 📝 README: Add a missing cd 2020-02-05 17:49:31 +01:00
3416a155ec 📝 README: Provide a complete example using real data and other processors
See #33.
2020-02-05 17:39:49 +01:00
f2001a79f1 Merge branch 'master' of https://github.com/OCR-D/ocrd_calamari 2020-02-05 16:19:12 +01:00
3e426b2a0a 📝 README: Use gt4histocr-calamari from the Makefile in the example
See #33.
2020-02-05 16:18:30 +01:00
46fe34400f 📝 README: Link to the correct ocrd-tool.json 2020-02-05 13:33:52 +01:00
0c7cd69526 📝 README: Update intro that we're mostly on par with Calamari's functionality 2020-02-05 13:33:02 +01:00
b802b4deaf Allow configuring a cut off confidence value for glyph alternatives 2020-02-05 13:29:44 +01:00
e39a2bce01 📝 Fix example parameters JSON 2020-02-05 13:07:56 +01:00
ef3fb44fb5 Allow controlling of output hierarchy level, e.g. only line, not words+glyphs 2020-02-05 13:02:10 +01:00
0f0bae18ba Remove GT text to not accidentally check it instead of OCR text 2020-02-04 19:29:56 +01:00
82fe0333f1 Test word segmentation (Fixes #30) 2020-02-04 18:40:06 +01:00
9010250911 ♻ test: Move binarization into the workspace fixture 2020-02-04 13:54:45 +01:00
6f4736f8e4 Do word segmentation as expected by OCR-D PAGE specs 2020-02-03 19:10:16 +01:00
0f9c94e7dc 🐛 Start with TextEquiv index=1 to adhere to OCR-D PAGE conventions
https://ocr-d.github.io/page#multiple-textequivs
2020-02-03 17:40:45 +01:00
909632493b 🚧 Add future TODOs 2020-02-03 17:37:19 +01:00
3149e1d9e0 📝 unwanted() 2020-02-03 15:33:38 +01:00
91cca1e1b8 📝 Document why we are using Unicode text segmentation to produce word results 2020-02-03 15:33:11 +01:00