80 Commits (master)

Author SHA1 Message Date
Robert Sachunsky ad3db18273
non-legacy namespace package 6 months ago
vahidrezanezhad eaf8ecd4d4
Merge pull request #62 from mikegerber/remove-unused-variable
Remove unused variable 'possibles'
2 years ago
Gerber, Mike 2553bd39f9 Remove unused variable 'possibles' 2 years ago
Gerber, Mike 8f64deefd6 🐛 Make required options required
When the user does not give, e.g. an input image, the program just
reports a cryptic error message.

Fix this and make the CLI friendlier by requiring the options using the
Click API. Now the CLI gives a useful error:

  Error: Missing option '--image' / '-i'.
2 years ago
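A minimal sketch of what commit 8f64deefd6 describes, assuming a Click command with an `--image` / `-i` option as shown in the error message above. The `--out` / `-o` option, the help texts and the function body are hypothetical and only serve the illustration; the project's actual option list may differ.

```python
import click
from click.testing import CliRunner


@click.command()
# required=True lets Click reject a missing option before main() runs.
@click.option("--image", "-i", required=True, type=click.Path(),
              help="Input image file (hypothetical help text).")
# Hypothetical second option, not taken from the repository.
@click.option("--out", "-o", required=True, type=click.Path(),
              help="Output directory (hypothetical option).")
def main(image, out):
    """Toy stand-in for the real command-line entry point."""
    click.echo(f"Would process {image} into {out}")


if __name__ == "__main__":
    # Invoke the command without --image to see Click's usage error.
    runner = CliRunner()
    result = runner.invoke(main, ["--out", "results"])
    print(result.exit_code)
    print(result.output)
```

With `required=True`, Click itself reports the missing option and exits with a usage error, so the program never reaches code that would fail on a `None` argument.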
Robert Sachunsky 04e1972720 ocrd_tool.json: add model content-type 3 years ago
Konstantin Baierer 5c7fd26883 processor: self.resolve_resource model 4 years ago
vahidrezanezhad c241830b4b
Merge pull request #48 from bertsky/fix-coords
ensure valid coordinates by intersection with parent…
4 years ago
vahidrezanezhad 4c498fcad2
resolving issue https://github.com/qurator-spk/sbb_textline_detection/issues/53 4 years ago
Robert Sachunsky 261db14ec3 ensure valid coordinates by intersection with parent…
- Border: intersect with page frame
- text regions: intersect with (new) Border
- text lines: intersect with (new) text region
  (and back-transform at all)
4 years ago
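The idea behind commit 261db14ec3 (keep every child outline inside its parent) can be sketched in plain Python. This is a deliberate simplification: it clips a polygon against the parent's axis-aligned bounding box using Sutherland-Hodgman clipping, whereas the commit itself may well intersect arbitrary polygons. The function name `clip_to_box` and the sample coordinates are hypothetical.

```python
def clip_to_box(polygon, box):
    """Clip a polygon (list of (x, y) tuples) to a box (x0, y0, x1, y1).

    Simplified stand-in for a polygon/polygon intersection: the parent
    is reduced to its axis-aligned bounding box.
    """
    x0, y0, x1, y1 = box

    def cross_x(x):
        # Point where segment p-q crosses the vertical line at x.
        def intersect(p, q):
            t = (x - p[0]) / (q[0] - p[0])
            return (x, p[1] + t * (q[1] - p[1]))
        return intersect

    def cross_y(y):
        # Point where segment p-q crosses the horizontal line at y.
        def intersect(p, q):
            t = (y - p[1]) / (q[1] - p[1])
            return (p[0] + t * (q[0] - p[0]), y)
        return intersect

    def clip(points, inside, intersect):
        # One Sutherland-Hodgman pass against a single box edge.
        result = []
        for i, cur in enumerate(points):
            prev = points[i - 1]
            if inside(cur):
                if not inside(prev):
                    result.append(intersect(prev, cur))
                result.append(cur)
            elif inside(prev):
                result.append(intersect(prev, cur))
        return result

    edges = [
        (lambda p: p[0] >= x0, cross_x(x0)),
        (lambda p: p[0] <= x1, cross_x(x1)),
        (lambda p: p[1] >= y0, cross_y(y0)),
        (lambda p: p[1] <= y1, cross_y(y1)),
    ]
    points = list(polygon)
    for inside, intersect in edges:
        if not points:
            break  # polygon lies completely outside the parent box
        points = clip(points, inside, intersect)
    return points


if __name__ == "__main__":
    # Hypothetical text line that sticks out of its text region on the right.
    region_box = (0, 0, 200, 100)
    text_line = [(50, 20), (250, 20), (250, 60), (50, 60)]
    print(clip_to_box(text_line, region_box))
```

Applied top-down in the order the commit lists (Border against the page frame, text regions against the Border, text lines against their text region), such a step keeps every child outline within its parent.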
Gerber, Mike 665b739fb8 🐛 sbb_textline_detector: Re-base Border coords too 4 years ago
Gerber, Mike 006c7765b0 🐛 sbb_textline_detector: Filter cropped images (OCR-D) 4 years ago
Gerber, Mike 37cc513ce9 🚧 sbb_textline_detector: Translate detected coordinates 4 years ago
Gerber, Mike a9b9c8a885 🚧 sbb_textline_detector: Get image via image_from_page 4 years ago
Konstantin Baierer f167f6768c getLogger per method 4 years ago
vahidrezanezhad 0f09f4a1f6
Update main.py
Issues 30 and 40 are resolved
4 years ago
Clemens Neudecker e4798c6811
replace 'PrintSpace' with 'Border' 4 years ago
Clemens Neudecker 36adbe29d8
replace 'PrintSpace' with 'Border' 4 years ago
Konstantin Baierer 05deb03ec8 use make_file_id and assert_file_grp_cardinality 4 years ago
Gerber, Mike 8b01d9e671 🐛 sbb_textline_detection: Set pcGtsId
Newest OCR-D workspace validation requires that the pcGtsId of a
PAGE-XML file matches its METS mets:file/ID. Fix this by setting
it correctly.
4 years ago
Mike Gerber 3593506e72
🔧 ocrd-tool.json: Update description, steps and categories
Fixes #31.
5 years ago
Lucas Sulzbach ead1eae114 ocrd-tool.json: Make description OCR-D compliant 5 years ago
vahidrezanezhad f94944ee80
change scaling 5 years ago
b-vr103 b9caa8e12c resolve 2020-02-17-bug-sbb_textline_detector 5 years ago
b-vr103 1446d7c662 getting robust and doing sth for verticals 5 years ago
b-vr103 3941f2f17d gettin robust and doing sth for verticals 5 years ago
Gerber, Mike f90b3cfa86 🔊 sbb_textline_detector: In OCR-D interface, warn if overwriting existing segmentation 5 years ago
Gerber, Mike 11c0e9cee5 🐛 sbb_textline_detector: Do not print PAGE output to stdout
ocrd-sbb-textline-detector uses ocrd_page's parse() to parse XML input,
which writes the XML to stdout by default.

Fix this by silencing it using parse()'s silence=True.
5 years ago
wrznr 4fc57d7756 Assign page id 5 years ago
wrznr 9e9163e852 Simplify the iteration over files in the input file group 5 years ago
Mike Gerber 6e0decb5ec
Merge pull request #12 from kba/rename-tool
Rename ocrd_sbb.. to ocrd-sbb... in ocrd_cli.py, ht @bertsky
5 years ago
Gerber, Mike 5fb30a7a1f Revert "Merge branch 'master' of https://github.com/qurator-spk/sbb_textline_detector"
This reverts commit 417b9235d5, reversing
changes made to a74974b7b6.
5 years ago
Konstantin Baierer cf6381c148 Rename ocrd_sbb.. to ocrd-sbb... in ocrd_cli.py, ht @bertsky 5 years ago
Clemens Neudecker 51e241fd84
Merge pull request #5 from cneud/cneud-fix-typos
Fix typos
5 years ago
Clemens Neudecker 12c07f389d
Merge pull request #7 from cneud/cneud-fix-docstring
fix docstring
5 years ago
Clemens Neudecker 29870f26e1
Merge pull request #4 from cneud/cneud-PAGE2019
PAGE2019
5 years ago
Konstantin Baierer b6ca1a7c53 kebab-case snake_case executable, fix #9 5 years ago
Clemens Neudecker 6c0bfba686
fix typos 5 years ago
Clemens Neudecker c8bc468628
fix docstring 5 years ago
Clemens Neudecker e696a068cb
Fix typos 5 years ago
Clemens Neudecker d90dad48fd
PAGE2019 5 years ago
Rezanezhad, Vahid 19116091f9 Update config_params.json 5 years ago
Gerber, Mike af5cbe9052 🐛 sbb_textline_detector: Fix making the output file id 5 years ago
Rezanezhad, Vahid 2112bb18c6 fixed the bug: local variable 't4' referenced before assignment 5 years ago
Rezanezhad, Vahid a11f6740cb Update main.py - robust deskewing and better page extraction 5 years ago
Rezanezhad, Vahid 0182b7087f remove multiprocessing bug 5 years ago
Gerber, Mike 8fa7179560 🐛 sbb_textline_detector: Disable multiprocessing to fix race condition
Lines were sorted in the wrong regions. Work around this by disabling
multiprocessing until a proper fix is done.
5 years ago
Gerber, Mike 4aed06a325 sbb_textline_detection: Preserve input PAGE info by merging segmentation results
ocrd_sbb_textline_detection used the output XML by main.py as is, and
– by doing this – threw away any input data from the input PAGE,
including the critical pc:AlternativeImage and the less important
pc:MetadataItem.

Fix this by merging the segmentation results into a file created from
the input file.

Also add a pc:MetadataItem processingStep about the segmentation
operation.
5 years ago
Gerber, Mike 4fb3e70ef6 🧹 sbb_textline_detector: Do not create empty/space-only TextEquivs (again) 5 years ago
Gerber, Mike bf41a29e7b 🐛 sbb_textline_detector: Do not hardcode Created/LastChange elements 5 years ago
Gerber, Mike fbd21cdb81 🧹 sbb_textline_detector: Do not create empty/space-only TextEquivs (again) 5 years ago