Commit Graph

91 Commits (352783cf7c2710ed9ef4d31f4ea4b2dacf187148)

Author SHA1 Message Date
Robert Sachunsky 352783cf7c non-legacy namespace package
Kai Labusch 06c8b382db character normalization based on aletheia mapping
Kai Labusch eac71b3e40 Merge pull request from qurator-spk/fix-ppn-xpath
make xpath for PPN number more specific to avoid catching the PPN of containing work
Konstantin Baierer 3a8bfa74cc fix namespace typo: s/mets/mods/
Co-authored-by: Stefan Weil <sw@weilnetz.de>
Kai Labusch 2f7d01c7cd fix alto2tsv bug
Kai Labusch eb750752c6 Merge pull request from stweil/typo
Fix typo (found by codespell)
Stefan Weil 3f35554a70 Fix typo (found by codespell)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Kai Labusch fa1c6b5aa4 Merge pull request from stweil/gitignore
.gitignore: Ignore build directory
Stefan Weil 175694d25d .gitignore: Ignore build directory
That directory is created by `make all` from ocrd_all and should be ignored
to get a clean `git status`.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Kai Labusch 0ec6f83c4c add alto2tsv
Konstantin Baierer 82769077df make xpath for PPN number more specific to avoid catching the PPN of containing work
Konstantin Baierer 0f64f07635 📦 v0.0.1
Konstantin Baierer 3b10dcb05b Merge branch 'ocrd-processors' of https://github.com/kba/page2tsv into ocrd-processors
# Conflicts:
#	setup.py
Konstantin Baierer 1c0c1cd525 ocrd processors: use snake_case for add_file
Konstantin Baierer e1a440b91c install into qurator namespace
Konstantin Baierer abeca0df16 drop requirement for matplotlib (not used)
Konstantin Baierer db25239075 Merge branch 'master' into ocrd-processors
# Conflicts:
#	setup.py
Kai Labusch a0e5c82929 Merge branch 'master' into ocrd-processors
Kai 75796b5c0c refactor
Konstantin Baierer 81ba7cff82 tests
Konstantin Baierer 60a07c6310 drop support for scaling, not necessary for SBB use case anymore
Konstantin Baierer fe4a1eabb1 setup.py: use ocrd-tool.json for version
Konstantin Baierer aabcc4866d remove obsolete tsv.py (now in qurator-sbb-tools
Konstantin Baierer f813c45ba2 Merge remote-tracking branch 'origin/master' into ocrd-processors
Konstantin Baierer aeb67e445f implement page2tsv/tsv2page as ocrd-neat-{ex,im}port
Konstantin Baierer 0aee20a7f6 cli: separate tsv2page and tsv2page_cli
Konstantin Baierer fe0c355e5a cli: produce TSV if no words are transcribed
Konstantin Baierer 93ee53c8e2 cli: split page2tsv from page2tsv_cli
Kai 9d2d5fcd31 add missing imports
Kai 568e1cd104 remove ner/ned code from page2tsv package
Kai ed90193c45 support segmentation only Page-XML
Kai ee5f03ce07 change default scale factor to 1.0
Kai 5e60fabe4a revert changes
Kai e5b635ec2d try other coordinate computation
Kai f320904503 try other coordinate computation
Kai 1eb05d0d62 xlrd does not support xsls files anymore
Kai ae93668bac xlrd does not support xsls files anymore
Kai 2bd4ae8d5a add ned-priority option to page2tsv
Kai d4eb95b64b make code more robust
Kai 49861b1652 support confidences in find-entities
Kai 0da38d6ec6 support confidences in find-entities
Kai 9b3198e401 add priority option for find-entities
Kai 7b53cc5539 add priority option for find-entities
Kai 318d9bd122 fix
Kai Labusch abcdb67e9e Merge pull request from kba/lineid-ocr-tsv
Retain line_id, tsv2page CLI to propagate results back to PAGE-XML
Konstantin Baierer f03acbf54d tsv2page CLI to propagate TSV results back to PAGE-XML
Konstantin Baierer ad379aea2b store pc:TextLine ID in TSV, fix
Kai Labusch 9c63631d7a Merge pull request from kba/core-page-api
use OCR-D/core PAGE API for reading order and recursive regions
Konstantin Baierer 675c88a67d requirements: ocrd pulls in requests already
Konstantin Baierer d80b02c56d use OCR-D/core PAGE API for reading order and recursive regions