b2e02dbf64
🐛 ppn2ocr: Don't break now that we have IIIF URLs
2022-04-07 18:12:49 +02:00
c65dbf9b1f
🧹 ppn2ocr: Remove obsolete comments re file: URLs
2021-09-15 17:45:26 +02:00
9a2cfa35d1
🎨 ppn2ocr: Fix bad indentation
2021-09-15 17:37:31 +02:00
f197f01d3f
✨ ppn2ocr: Keep only wanted file groups
2021-09-15 17:26:14 +02:00
91296ffa0e
⚙️ ppn2ocr: Move pruning file groups into a function
2021-09-15 17:12:11 +02:00
6ae4bc8e3a
⚙️ ppn2ocr: Use new API_URL ( https://oai.sbb.berlin )
2021-09-15 17:11:12 +02:00
7add6858fc
🐛 ppn2ocr: Gracefully handle documents without DEFAULT, e.g. multi-volume works
2021-03-03 16:17:14 +01:00
691be243f6
✨ Use MAX file group name instead of BEST
We were using the file group name BEST for what Kitodo seems to call
MAX by convention. So we use MAX now.
Currently, we work under the assumption that, if MAX exists in the METS
retrieved by OAI-PMH, it's not what we want and we replace it with our
own full-size IIIF URLs.
Fixes GH-43.
2021-02-18 16:34:25 +01:00
64d0d85d3e
🎨 ppn2ocr: Fix some whitespace code style issues
2020-09-03 17:18:42 +02:00
1a532b1ccc
✨ Validate PPN argument
ppn2ocr expects the PPN to be in the PPNxxxxxxx format, i.e. including
the leading 'PPN' string. Validate the argument accordingly.
2020-09-03 16:59:50 +02:00
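The check described in the "Validate PPN argument" commit above can be sketched roughly as follows. This is a minimal illustration, not the code from that commit: the name `is_valid_ppn`, the pattern, and the assumption that the part after `PPN` is a run of digits optionally ending in an `X` are all hypothetical.

```python
import re

# Assumption (not confirmed by the commit): after the leading "PPN" the
# identifier is one or more digits, optionally followed by a final "X".
PPN_PATTERN = re.compile(r"PPN[0-9]+X?")


def is_valid_ppn(value: str) -> bool:
    """Return True if value has the form PPNxxxxxxx, including the 'PPN' prefix."""
    # fullmatch() requires the whole string to match, so a bare number
    # without the prefix, or trailing junk, is rejected.
    return PPN_PATTERN.fullmatch(value) is not None
```

A command line entry point could call such a helper before doing any network requests and exit with an error message when it returns False.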
f7b43bbefa
✨ ppn2ocr: Support TIFF in the BEST group
2020-06-23 19:03:58 +02:00
4e37a52899
Merge branch 'master' of github.com:mikegerber/my_ocrd_workflow
2020-06-23 15:17:03 +02:00
bb703152db
🐛 ppn2ocr: Verify oai.sbb.berlin's certificate again
Now that oai.sbb.berlin's certificate chain is fixed, remove the
workaround again.
Fixes GH#15.
2020-06-23 15:15:21 +02:00
af4557fb33
Merge branch 'master' of https://github.com/mikegerber/my_ocrd_workflow
2020-06-18 15:46:46 +02:00
d2c316285c
🧹 ppn2ocr: Remove obsolete show_help()
2020-06-17 16:44:17 +02:00
f5b2eed8a6
🐛 ppn2ocr: Work around oai.sbb.berlin certificate problem
oai.sbb.berlin does not have a valid certificate:
% curl https://oai.sbb.berlin
curl: (60) SSL certificate problem: unable to get local issuer certificate
More details here: https://curl.haxx.se/docs/sslcerts.html
curl failed to verify the legitimacy of the server and therefore could not
establish a secure connection to it. To learn more about this situation and
how to fix it, please visit the web page mentioned above.
Work around this by setting verify=False.
2020-06-09 11:19:25 +02:00
448bf9e256
🐛 ppn2ocr: Remove LOCAL file group too
2020-06-04 19:55:00 +02:00
4e19e2a655
💄 ppn2ocr: Add a proper CLI interface
2020-06-03 15:53:45 +02:00
70eb73e4c7
🧹 ppn2ocr: (Re)Move TODOs
2020-06-03 15:34:00 +02:00
05dbffeb7a
🚧 ppn2ocr: Do not call workflow for now
2020-06-03 10:12:36 +02:00
10f5198fa6
🚧 ppn2ocr: s/contain/encapsulate
2020-06-03 10:11:23 +02:00
f893b339c5
🚧 ppn2ocr: Properly remove the PRESENTATION file group
2020-06-03 10:10:54 +02:00
014e70fe35
🚧 ppn2ocr: Actually run the workflow
2020-06-02 19:25:31 +02:00
74cb361723
🚧 ppn2ocr: Extract a function to contain the IIIF hack
2020-06-02 19:18:06 +02:00
c7c8934e89
🚧 ppn2ocr: Convert to Python + fumble in IIIF URLs
2020-06-02 19:06:31 +02:00
7c5cbc7244
📝 ppn2ocr: Add to README, including proxy configuration
2020-05-22 17:23:49 +02:00
1585247482
✨ ppn2ocr: Make PPN a command line parameter
2020-05-22 17:15:50 +02:00
2a4b204fbe
🎨 ppn2ocr: Extract a function to make a workspace
2020-05-22 16:53:20 +02:00
18d4ab0ba1
✨ ppn2ocr: Use a better example document
2020-05-22 16:45:19 +02:00
8024064697
🐛 ppn2ocr: Fix file:/ links to use file:///, and remove unavailable LOCAL file group
2020-05-22 16:09:00 +02:00
3b60b26c53
🐛 ppn2ocr: Do not set no_proxy here
2020-05-18 21:03:06 +02:00
5675047047
🧹 ppn2ocr: We already use run-docker-hub
2020-05-14 16:25:34 +02:00
770af0a205
🚧 WIP: Add script ppn2ocr to run a document by giving PPN
2020-03-09 18:27:29 +01:00