kba
4772fd17e2
missed changing override mechanism in eynollah_ocr
2025-10-29 20:47:13 +01:00
kba
29c273685f
fix merge issues
2025-10-29 20:15:19 +01:00
kba
de76eabc1d
Merge branch 'cli-logging' into model-zoo
2025-10-29 19:41:01 +01:00
kba
5e22e9db64
model_zoo: make type str to reduce importing overhead
2025-10-29 19:16:35 +01:00
kba
a913bdf7dc
make --model-basedir and --model-overrides top-level CLI options
2025-10-29 18:48:41 +01:00
kba
b6f82c72b9
refactor cli tests
2025-10-29 17:23:21 +01:00
kba
ef999c8f0a
Merge branch 'model-zoo' of lx0145.sbb.spk-berlin.de:/data/eynollah into model-zoo
2025-10-27 11:45:20 +01:00
kba
294b6356d3
wip
2025-10-27 11:45:16 +01:00
kba
51d2680d9c
wip
2025-10-27 11:44:59 +01:00
kba
ec1fd93dad
wip
2025-10-23 11:58:23 +02:00
kba
883546a6b8
eynollah models package
2025-10-22 17:05:40 +02:00
kba
04bc4a63d0
reorganize model_zoo
2025-10-22 16:04:48 +02:00
kba
d94285b3ea
rewrite model spec data structure
2025-10-22 13:07:35 +02:00
kba
146658f026
eynollah layout: fix trocr_processor model_zoo call
2025-10-22 10:48:26 +02:00
kba
4c8abfe19c
eynollah_ocr: actually replace the model calls
2025-10-22 10:48:26 +02:00
kba
1337461d47
adopt image_enhancer to the zoo
2025-10-21 19:24:55 +02:00
kba
f0c86672f8
adopt mb_ro_on_layout to the zoo
2025-10-21 17:55:08 +02:00
kba
bcffa2e503
adopt binarizer to the zoo
2025-10-21 17:53:24 +02:00
kba
a53d5fc452
update docs/makefile to point to v0.6.0 models
2025-10-21 13:15:57 +02:00
kba
c6b863b13f
typing and asserts
2025-10-21 12:05:27 +02:00
kba
44b75eb36f
cli: model -> model_basedir
2025-10-21 11:05:12 +02:00
kba
062f317d2e
Introduce model_zoo to Eynollah_ocr
2025-10-20 21:14:52 +02:00
kba
d609a532bf
organize imports mostly
2025-10-20 19:46:07 +02:00
kba
48d1198d24
move Eynollah_ocr to separate module
2025-10-20 19:15:31 +02:00
kba
a850ef39ea
factor model loading in Eynollah to EynollahModelZoo
2025-10-20 18:34:44 +02:00
kba
6c89888166
Refactor CLI for consistent logging and late imports
2025-10-17 17:47:59 +02:00
kba
38c028c6b5
📦 v0.6.0
2025-10-17 10:36:30 +02:00
kba
76c13bcfd7
Merge branch 'integrate-training-from-sbb_pixelwise_segmentation' of https://github.com/qurator-spk/eynollah into integrate-training-from-sbb_pixelwise_segmentation
2025-10-16 20:50:24 +02:00
kba
af5abb77fd
Merge branch 'main' into integrate-training-from-sbb_pixelwise_segmentation
2025-10-16 20:50:16 +02:00
Robert Sachunsky
948c8c3441
join_polygons: try to catch rare case of MultiPolygon
2025-10-15 16:58:17 +02:00
kba
f485dd4181
📦 v0.6.0rc2
2025-10-14 16:10:50 +02:00
kba
7daa0a1bd5
Merge branch 'fix-196' into prepare-v0.6.0rc2
2025-10-14 14:52:36 +02:00
Robert Sachunsky
8299e7009a
setup_models: avoid unnecessarily loading region_fl
2025-10-14 14:27:32 +02:00
Robert Sachunsky
e8b7212f36
polygon2contour: avoid uint for coords
(introduced in a433c736 to make consistent with
`filter_contours_area_of_image`, but actually
np.uint is prone to create overflows downstream)
2025-10-14 14:27:26 +02:00
kba
745cf3be48
XML encoding should be utf-8 not utf8
... and should use OCR-D's generateDS PAGE API consistently
2025-10-10 16:39:17 +02:00
kba
2056a8bdb9
📦 v0.6.0rc1
2025-10-10 16:32:47 +02:00
Robert Sachunsky
4e9a1618c3
layout: refactor model setup, allow loading custom versions
- simplify definition of (defaults for) model versions
- unify loading of loadable models (depending on mode)
- use `self.models` dict instead of `self.model_*` attributes
- add `model_versions` kwarg / `--model_version` CLI option
2025-10-10 03:18:09 +02:00
Robert Sachunsky
c4cb16c2a8
simplify
(`skip_layout_and_reading_order` is already an attr)
2025-10-09 23:05:50 +02:00
Robert Sachunsky
ecb53056f2
Merge branch 'main' of https://github.com/qurator-spk/eynollah into loky-with-shm-for-175-rebuilt
2025-10-09 22:54:11 +02:00
Robert Sachunsky
b3d29bef89
return_contours_of_interested_region*: rm unused variants
2025-10-09 20:14:11 +02:00
Robert Sachunsky
8a2d682e12
fix identifier scope in layout OCR options (w/o full_layout)
2025-10-09 20:14:11 +02:00
Robert Sachunsky
096def1e9d
mbreorder/enhancement: fix missing imports
(not sure if these models really need that, though)
2025-10-09 20:14:11 +02:00
Robert Sachunsky
027b87d321
fixup c0137c2 (missing arguments for utils_ocr)
2025-10-09 20:14:11 +02:00
Robert Sachunsky
1d4815b48f
utils_ocr: forgot to pass coordinate offsets
2025-10-09 20:14:11 +02:00
Robert Sachunsky
5e11a68a3e
writer/run_single: consistent kwarg naming conf_contours_textregion(s)
2025-10-09 20:14:11 +02:00
Robert Sachunsky
75823f9bed
run_single: call writer.build_pagexml_no_full_layout w/ kwargs
2025-10-09 20:14:11 +02:00
Robert Sachunsky
cbbb3248c7
writer: simplify
- `build_pagexml_no_full_layout`: delegate to
  `build_pagexml_full_layout` (removing redundant code)
2025-10-09 20:14:11 +02:00
Robert Sachunsky
e32479765c
writer: simplify
- simplify serialization of coordinates
- re-use `serialize_lines_in_region` (drop `*_in_dropcapital` and `*_in_marginal`)
- re-use `calculate_polygon_coords`
2025-10-09 20:14:11 +02:00
Robert Sachunsky
d88ca18eec
get/do_work_of_slopes etc.: reduce call/return signatures
- `get_textregion_contours_in_org_image_light`: no more need
  to also return unchanged contours here (see 41cc38c5); therefore
- `txt_con_org`: no more need for this
  (now mere alias to `contours_only_text_parent`); also
- `index_by_text_par_con`: no more need for this (see prev. commit),
  so do not pass/return
- `get_slopes_and_deskew_*`: do not pass `contours_only_text`
  (where not used)
- `get_slopes_and_deskew_*`: do not return unchanged contours, boxes
- `do_work_of_slopes_*`: adapt respectively
2025-10-09 20:14:11 +02:00
Robert Sachunsky
02a347a48a
no more need to rm from contours_only_text_parent_d_ordered now
2025-10-09 20:14:11 +02:00