Commit graph

126 commits

Author SHA1 Message Date
Clemens Neudecker
12c07f389d Merge pull request #7 from cneud/cneud-fix-docstring
fix docstring
2019-12-06 19:44:34 +01:00
Clemens Neudecker
3b51a14600 Merge pull request #1 from cneud/add-license-1
Add LICENSE
2019-12-06 19:44:23 +01:00
Clemens Neudecker
29870f26e1 Merge pull request #4 from cneud/cneud-PAGE2019
PAGE2019
2019-12-06 19:44:04 +01:00
Clemens Neudecker
c0989f5b55 Merge pull request #3 from cneud/cneud-opencv-python-headless
Relax requirements.txt
2019-12-06 19:41:04 +01:00
Clemens Neudecker
9b784e3a81 ocrd implies click 2019-12-06 19:03:10 +01:00
Clemens Neudecker
7c7f035b69 matplotlib implies numpy 2019-12-06 19:02:50 +01:00
Clemens Neudecker
1c4ddac3b6 Merge pull request #10 from kba/kebab-snake
kebab-case snake_case executable, fix #9
2019-12-06 18:40:05 +01:00
Konstantin Baierer
b6ca1a7c53 kebab-case snake_case executable, fix #9 2019-12-06 18:26:09 +01:00
1b73c3c23e 📝 sbb_textline_detector: Break long line for ocrd_sbb_textline_detector example 2019-12-06 12:34:15 +01:00
eb4c8ee99c 📝 sbb_textline_detector: Break long line for ocrd_sbb_textline_detector example 2019-12-06 12:34:15 +01:00
b15fed32ff Merge branch 'master' of https://github.com/qurator-spk/sbb_textline_detector 2019-12-06 12:25:01 +01:00
482c0fd095 📝 sbb_textline_detector: Document OCR-D Usage 2019-12-06 11:42:23 +01:00
3935204338 📝 sbb_textline_detector: Document OCR-D Usage 2019-12-06 11:42:23 +01:00
Clemens Neudecker
3b526ef40d refactor class name 2019-12-06 02:27:23 +01:00
Clemens Neudecker
6c0bfba686 fix typos 2019-12-06 02:21:04 +01:00
Clemens Neudecker
5113d28e13 do not require sudo 2019-12-06 00:48:38 +01:00
Clemens Neudecker
02388a759d Update README.md 2019-12-06 00:47:53 +01:00
Clemens Neudecker
c8bc468628 fix docstring 2019-12-06 00:40:05 +01:00
Clemens Neudecker
2ecb021870 refactor class name 2019-12-06 00:35:11 +01:00
Clemens Neudecker
e696a068cb Fix typos 2019-12-06 00:20:34 +01:00
Clemens Neudecker
d90dad48fd PAGE2019 2019-12-05 22:24:28 +01:00
Clemens Neudecker
58f5d2b3c5 Update requirements.txt 2019-12-05 22:11:16 +01:00
Clemens Neudecker
b22a812979 Improve README.md 2019-12-05 22:06:44 +01:00
Clemens Neudecker
eb64cc030f Create LICENSE 2019-12-05 22:01:47 +01:00
b-vr103
d08712533a Merge branch 'master' of https://github.com/qurator-spk/sbb_textline_detector 2019-12-05 16:49:27 +01:00
vahidrezanezhad
af670b55ac Update README.md 2019-12-05 16:31:12 +01:00
vahidrezanezhad
1013b7ed64 Update README.md 2019-12-05 16:30:09 +01:00
vahidrezanezhad
eeff5a0b2d Update README.md 2019-12-05 16:15:07 +01:00
vahidrezanezhad
fb7c605515 Update README.md 2019-12-05 16:06:55 +01:00
vahidrezanezhad
ad4f7acdd8 Update README.md 2019-12-05 15:47:26 +01:00
vahidrezanezhad
a836a083c1 Update README.md 2019-12-05 15:47:02 +01:00
vahidrezanezhad
b0dc6491c7 Update README.md 2019-12-05 15:46:36 +01:00
Rezanezhad, Vahid
19116091f9 Update config_params.json 2019-12-05 14:05:55 +01:00
af5cbe9052 🐛 sbb_textline_detector: Fix making the output file id 2019-12-04 11:42:45 +01:00
Rezanezhad, Vahid
2112bb18c6 fixed the bug: local variable 't4' referenced before assignment 2019-11-29 11:29:12 +01:00
Rezanezhad, Vahid
a11f6740cb Update main.py - robust deskewing and better page extraction 2019-11-28 16:19:44 +01:00
Rezanezhad, Vahid
0182b7087f remove multiprocessing bug 2019-11-20 14:05:15 +01:00
8fa7179560 🐛 sbb_textline_detector: Disable multiprocessing to fix race condition
Lines were sorted in the wrong regions. Work around this by disabling
multiprocessing until a proper fix is done.
2019-11-20 09:50:29 +01:00
4aed06a325 sbb_textline_detection: Preserve input PAGE info by merging segmentation results
ocrd_sbb_textline_detection used the output XML by main.py as is, and
– by doing this – threw away any input data from the input PAGE,
including the critical pc:AlternativeImage and the less important
pc:MetadataItem.

Fix this by merging the segmentation results into a file created from
the input file.

Also add a pc:MetadataItem processingStep about the segmentation
operation.
2019-11-19 15:08:53 +01:00
4fb3e70ef6 🧹 sbb_textline_detector: Do not create empty/space-only TextEquivs (again) 2019-11-19 11:08:41 +01:00
bf41a29e7b 🐛 sbb_textline_detector: Do not hardcode Created/LastChange elements 2019-11-19 11:05:18 +01:00
fbd21cdb81 🧹 sbb_textline_detector: Do not create empty/space-only TextEquivs (again) 2019-11-19 10:59:41 +01:00
Rezanezhad, Vahid
2d6dd92b31 Update main.py 2019-11-04 11:10:17 +01:00
Rezanezhad, Vahid
9f97f34255 Update main.py 2019-10-31 17:36:21 +01:00
Rezanezhad, Vahid
8c954a6c7a Update main.py 2019-10-31 17:08:35 +01:00
Rezanezhad, Vahid
6714481556 Update main.py 2019-10-31 10:54:57 +01:00
Rezanezhad, Vahid
719824f19d Update main.py 2019-10-30 13:37:54 +01:00
f94511a1d8 Merge branch 'master' of code.dev.sbb.berlin:qurator/mono-repo 2019-10-25 18:11:17 +02:00
4f28cd905a 🧹 sbb_textline_detector: Do not create empty/space-only TextEquivs
ocrd_tesserocr or ocrd_cis complain about already existing text if
empty/space-only TextEquivs elements exist after segmentation. Also, it
does not make sense to create them in a segmentation step.

Fix by removing the code generating the elements.
2019-10-25 18:08:31 +02:00
Rezanezhad, Vahid
00929ab391 Update main.py 2019-10-25 14:39:37 +02:00