Commit Graph

99 Commits (52e8e7f7c3ecb6677113b764a0cac1212f90c59b)
 

Author SHA1 Message Date
Kai Labusch 52e8e7f7c3 Update required python version;update README
Kai Labusch b9cb04389c add drop-columns options to tsv2tsv
Kai Labusch 198221651f fix command line output
Kai Labusch 80cf64abcf make code more robust
Kai Labusch 8d8bf517b9 remove spam
Kai Labusch ef8244a466 add tsv2tsv tool;make easy re-processing of tsv files possible
Kai Labusch 438b10e407 add tsv2tsv tool;make easy re-processing of tsv files possible
Kai Labusch 24ecc16b2d Merge pull request from r0man-ist/patch-1
Fix small error to prevent recursion on tsv2page
r0man-ist 772e6d1a42 Fix small error to prevent recursion on tsv2page
Kai Labusch 06c8b382db character normalization based on aletheia mapping
Kai Labusch eac71b3e40 Merge pull request from qurator-spk/fix-ppn-xpath
make xpath for PPN number more specific to avoid catching the PPN of containing work
Konstantin Baierer 3a8bfa74cc fix namespace typo: s/mets/mods/
Co-authored-by: Stefan Weil <sw@weilnetz.de>
Kai Labusch 2f7d01c7cd fix alto2tsv bug
Kai Labusch eb750752c6 Merge pull request from stweil/typo
Fix typo (found by codespell)
Stefan Weil 3f35554a70 Fix typo (found by codespell)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Kai Labusch fa1c6b5aa4 Merge pull request from stweil/gitignore
.gitignore: Ignore build directory
Stefan Weil 175694d25d .gitignore: Ignore build directory
That directory is created by `make all` from ocrd_all and should be ignored
to get a clean `git status`.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Kai Labusch 0ec6f83c4c add alto2tsv
Konstantin Baierer 82769077df make xpath for PPN number more specific to avoid catching the PPN of containing work
Konstantin Baierer 0f64f07635 📦 v0.0.1
Konstantin Baierer 3b10dcb05b Merge branch 'ocrd-processors' of https://github.com/kba/page2tsv into ocrd-processors
# Conflicts:
#	setup.py
Konstantin Baierer 1c0c1cd525 ocrd processors: use snake_case for add_file
Konstantin Baierer e1a440b91c install into qurator namespace
Konstantin Baierer abeca0df16 drop requirement for matplotlib (not used)
Konstantin Baierer db25239075 Merge branch 'master' into ocrd-processors
# Conflicts:
#	setup.py
Kai Labusch a0e5c82929 Merge branch 'master' into ocrd-processors
Kai 75796b5c0c refactor
Konstantin Baierer 81ba7cff82 tests
Konstantin Baierer 60a07c6310 drop support for scaling, not necessary for SBB use case anymore
Konstantin Baierer fe4a1eabb1 setup.py: use ocrd-tool.json for version
Konstantin Baierer aabcc4866d remove obsolete tsv.py (now in qurator-sbb-tools
Konstantin Baierer f813c45ba2 Merge remote-tracking branch 'origin/master' into ocrd-processors
Konstantin Baierer aeb67e445f implement page2tsv/tsv2page as ocrd-neat-{ex,im}port
Konstantin Baierer 0aee20a7f6 cli: separate tsv2page and tsv2page_cli
Konstantin Baierer fe0c355e5a cli: produce TSV if no words are transcribed
Konstantin Baierer 93ee53c8e2 cli: split page2tsv from page2tsv_cli
Kai 9d2d5fcd31 add missing imports
Kai 568e1cd104 remove ner/ned code from page2tsv package
Kai ed90193c45 support segmentation only Page-XML
Kai ee5f03ce07 change default scale factor to 1.0
Kai 5e60fabe4a revert changes
Kai e5b635ec2d try other coordinate computation
Kai f320904503 try other coordinate computation
Kai 1eb05d0d62 xlrd does not support xsls files anymore
Kai ae93668bac xlrd does not support xsls files anymore
Kai 2bd4ae8d5a add ned-priority option to page2tsv
Kai d4eb95b64b make code more robust
Kai 49861b1652 support confidences in find-entities
Kai 0da38d6ec6 support confidences in find-entities
Kai 9b3198e401 add priority option for find-entities