Commit graph

34 commits

Author SHA1 Message Date
5dffd843aa 🚧 WIP: Migrate to using ocrd:all image - Move extra script to their own sub-directory 2024-04-25 20:29:29 +02:00
b2e02dbf64 🐛 ppn2ocr: Don't break now that we have IIIF URLs
2022-04-07 18:12:49 +02:00
c65dbf9b1f 🧹 ppn2ocr: Remove obsolete comments re file: URLs 2021-09-15 17:45:26 +02:00
9a2cfa35d1 🎨 ppn2ocr: Fix bad indentation 2021-09-15 17:37:31 +02:00
f197f01d3f ppn2ocr: Keep only wanted file groups 2021-09-15 17:26:14 +02:00
91296ffa0e ⚙️ ppn2ocr: Move pruning file groups into a function 2021-09-15 17:12:11 +02:00
6ae4bc8e3a ⚙️ ppn2ocr: Use new API_URL (https://oai.sbb.berlin) 2021-09-15 17:11:12 +02:00
7add6858fc 🐛 ppn2ocr: Gracefully handle documents without DEFAULT, e.g. multi-volume works
2021-03-03 16:17:14 +01:00
691be243f6 Use MAX file group name instead of BEST
We were using the file group name BEST for what Kitodo seems to call
MAX by convention. So we use MAX now.

Currently, we work under the assumption that, if MAX exists in the METS
retrieved by OAI-PMH, it's not what we want and we replace it with our
own full-size IIIF URLs.

Fixes GH-43.
2021-02-18 16:34:25 +01:00
64d0d85d3e 🎨 ppn2ocr: Fix some whitespace code style issues 2020-09-03 17:18:42 +02:00
1a532b1ccc Validate PPN argument
ppn2ocr expects the PPN to be in the PPNxxxxxxx format, i.e. including
the leading 'PPN' string. Validate the argument accordingly.
2020-09-03 16:59:50 +02:00
f7b43bbefa ppn2ocr: Support TIFF in the BEST group 2020-06-23 19:03:58 +02:00
4e37a52899 Merge branch 'master' of github.com:mikegerber/my_ocrd_workflow 2020-06-23 15:17:03 +02:00
bb703152db 🐛 ppn2ocr: Verify oai.sbb.berlin's certificate again
Now that oai.sbb.berlin's certificate chain is fixed, remove the
workaround again.

Fixes GH#15.
2020-06-23 15:15:21 +02:00
af4557fb33 Merge branch 'master' of https://github.com/mikegerber/my_ocrd_workflow 2020-06-18 15:46:46 +02:00
d2c316285c 🧹 ppn2ocr: Remove obsolete show_help() 2020-06-17 16:44:17 +02:00
f5b2eed8a6 🐛 ppn2ocr: Work around oai.sbb.berlin certificate problem
oai.sbb.berlin does not have a valid certificate:

% curl https://oai.sbb.berlin
curl: (60) SSL certificate problem: unable to get local issuer certificate
More details here: https://curl.haxx.se/docs/sslcerts.html

curl failed to verify the legitimacy of the server and therefore could not
establish a secure connection to it. To learn more about this situation and
how to fix it, please visit the web page mentioned above.

Work around this by setting verify=False.
2020-06-09 11:19:25 +02:00
448bf9e256 🐛 ppn2ocr: Remove LOCAL file group too 2020-06-04 19:55:00 +02:00
4e19e2a655 💄 ppn2ocr: Add a proper CLI interface 2020-06-03 15:53:45 +02:00
70eb73e4c7 🧹 ppn2ocr: (Re)Move TODOs 2020-06-03 15:34:00 +02:00
05dbffeb7a 🚧 ppn2ocr: Do not call workflow for now 2020-06-03 10:12:36 +02:00
10f5198fa6 🚧 ppn2ocr: s/contain/encapsulate 2020-06-03 10:11:23 +02:00
f893b339c5 🚧 ppn2ocr: Properly remove the PRESENTATION file group 2020-06-03 10:10:54 +02:00
014e70fe35 🚧 ppn2ocr: Actually run the workflow 2020-06-02 19:25:31 +02:00
74cb361723 🚧 ppn2ocr: Extract a function to contain the IIIF hack 2020-06-02 19:18:06 +02:00
c7c8934e89 🚧 ppn2ocr: Convert to Python + fumble in IIIF URLs 2020-06-02 19:06:31 +02:00
7c5cbc7244 📝 ppn2ocr: Add to README, including proxy configuration 2020-05-22 17:23:49 +02:00
1585247482 ppn2ocr: Make PPN a command line parameter 2020-05-22 17:15:50 +02:00
2a4b204fbe 🎨 ppn2ocr: Extract a function to make a workspace 2020-05-22 16:53:20 +02:00
18d4ab0ba1 ppn2ocr: Use a better example document 2020-05-22 16:45:19 +02:00
8024064697 🐛 ppn2ocr: Fix file:/ links to use file:///, and remove unavailable LOCAL file group 2020-05-22 16:09:00 +02:00
3b60b26c53 🐛 ppn2ocr: Do not set no_proxy here 2020-05-18 21:03:06 +02:00
5675047047 🧹 ppn2ocr: We already use run-docker-hub 2020-05-14 16:25:34 +02:00
770af0a205 🚧 WIP: Add script ppn2ocr to run a document by giving PPN 2020-03-09 18:27:29 +01:00