Commit Graph

33 Commits (master)

Author SHA1 Message Date
Gerber, Mike b2e02dbf64 🐛 ppn2ocr: Don't break now that we have IIIF URLs
2 years ago
Gerber, Mike c65dbf9b1f 🧹 ppn2ocr: Remove obsolete comments re file: URLs 3 years ago
Gerber, Mike 9a2cfa35d1 🎨 ppn2ocr: Fix bad indentation 3 years ago
Gerber, Mike f197f01d3f ppn2ocr: Keep only wanted file groups 3 years ago
Gerber, Mike 91296ffa0e ⚙️ ppn2ocr: Move pruning file groups into a function 3 years ago
Gerber, Mike 6ae4bc8e3a ⚙️ ppn2ocr: Use new API_URL (https://oai.sbb.berlin) 3 years ago
Gerber, Mike 7add6858fc 🐛 ppn2ocr: Gracefully handle documents without DEFAULT, e.g. multi-volume works
3 years ago
Gerber, Mike 691be243f6 Use MAX file group name instead of BEST
We were using the file group name BEST for what Kitodo seems to call
MAX by convention. So we use MAX now.

Currently, we work under the assumption that, if MAX exists in the METS
retrieved by OAI-PMH, it's not what we want and we replace it with our
own IIIF URLs with full size.

Fixes GH-43.
3 years ago
Gerber, Mike 64d0d85d3e 🎨 ppn2ocr: Fix some whitespace code style issues 4 years ago
Gerber, Mike 1a532b1ccc Validate PPN argument
ppn2ocr expects the PPN to be in the PPNxxxxxxx format, i.e. including
the leading 'PPN' string. Validate the argument accordingly.
4 years ago
Gerber, Mike f7b43bbefa ppn2ocr: Support TIFF in the BEST group 4 years ago
Gerber, Mike 4e37a52899 Merge branch 'master' of github.com:mikegerber/my_ocrd_workflow 4 years ago
Gerber, Mike bb703152db 🐛 ppn2ocr: Verify oai.sbb.berlin's certificate again
Now that oai.sbb.berlin's certificate chain is fixed, remove the
workaround again.

Fixes GH#15.
4 years ago
Gerber, Mike af4557fb33 Merge branch 'master' of https://github.com/mikegerber/my_ocrd_workflow 4 years ago
Gerber, Mike d2c316285c 🧹 ppn2ocr: Remove obsolete show_help() 4 years ago
Gerber, Mike f5b2eed8a6 🐛 ppn2ocr: Work around oai.sbb.berlin certificate problem
oai.sbb.berlin does not have a valid certificate:

% curl https://oai.sbb.berlin
curl: (60) SSL certificate problem: unable to get local issuer certificate
More details here: https://curl.haxx.se/docs/sslcerts.html

curl failed to verify the legitimacy of the server and therefore could not
establish a secure connection to it. To learn more about this situation and
how to fix it, please visit the web page mentioned above.

Work around this by setting verify=False.
4 years ago
Gerber, Mike 448bf9e256 🐛 ppn2ocr: Remove LOCAL file group too 4 years ago
Gerber, Mike 4e19e2a655 💄 ppn2ocr: Add a proper CLI interface 4 years ago
Gerber, Mike 70eb73e4c7 🧹 ppn2ocr: (Re)Move TODOs 4 years ago
Gerber, Mike 05dbffeb7a 🚧 ppn2ocr: Do not call workflow for now 4 years ago
Gerber, Mike 10f5198fa6 🚧 ppn2ocr: s/contain/encapsulate 4 years ago
Gerber, Mike f893b339c5 🚧 ppn2ocr: Properly remove the PRESENTATION file group 4 years ago
Gerber, Mike 014e70fe35 🚧 ppn2ocr: Actually run the workflow 4 years ago
Gerber, Mike 74cb361723 🚧 ppn2ocr: Extract a function to contain the IIIF hack 4 years ago
Gerber, Mike c7c8934e89 🚧 ppn2ocr: Convert to Python + fumble in IIIF URLs 4 years ago
Gerber, Mike 7c5cbc7244 📝 ppn2ocr: Add to README, including proxy configuration 4 years ago
Gerber, Mike 1585247482 ppn2ocr: Make PPN a command line parameter 4 years ago
Gerber, Mike 2a4b204fbe 🎨 ppn2ocr: Extract a function to make a workspace 4 years ago
Gerber, Mike 18d4ab0ba1 ppn2ocr: Use a better example document 4 years ago
Gerber, Mike 8024064697 🐛 ppn2ocr: Fix file:/ links to use file:///, and remove unavailable LOCAL file group 4 years ago
Gerber, Mike 3b60b26c53 🐛 ppn2ocr: Do not set no_proxy here 4 years ago
Gerber, Mike 5675047047 🧹 ppn2ocr: We already use run-docker-hub 4 years ago
Gerber, Mike 770af0a205 🚧 WIP: Add script ppn2ocr to run a document by giving PPN 4 years ago