Commit graph

1465 commits

Author SHA1 Message Date
kba
f58189d5f4 ci: tag eynollah docker image with git tag version if possible, else latest 2026-04-29 16:34:06 +02:00
Robert Sachunsky
bb092364af get_slopes_and_deskew_new_light2: estimate slopes here, too…
extract slopes from minimal bounding rectangles of textlines,
using heuristics on aspect ratios, lengths and angles
2026-04-24 15:27:29 +02:00
Robert Sachunsky
c478c03db4 avoid deskewed contour matching w/ -romb 2026-04-24 15:27:29 +02:00
Robert Sachunsky
998ee2ecee get_textlines_of_a_textregion_sorted: simplify 2026-04-24 15:27:29 +02:00
Robert Sachunsky
be61875d6e get_textlines_of_a_textregion_sorted: w-h instead of w/h test 2026-04-24 15:27:29 +02:00
Robert Sachunsky
9723dfeb73 writer: also annotate col-classifier result…
both notations:
- in `/PcGts/Page/@custom` (CSS-style)
- in `/PcGts/Metadata/Comment` (qurator-style)
2026-04-24 15:27:29 +02:00
Robert Sachunsky
e3720d6623 writer: also annotate page-level deskewing result 2026-04-24 15:27:29 +02:00
Robert Sachunsky
2da718f76f writer, do_work_of_slopes*: drop passing bboxes around
(needed no more)
2026-04-24 15:27:29 +02:00
Robert Sachunsky
b792324c5b do_work_of_slopes_new_curved (if angle >45°): simplify, improve…
- use new `rotate_image_enlarge` instead of
  custom (insufficient) padding w/ `rotate_image`
- get external contours instead of tree
  (without checking hierarchy afterwards)
- use largest textline contours by area instead of
  longest polygon path
- always use `separate_lines` (but without its incorrect
  angle/offset calculations) instead of `separate_lines_vertical_cont`
- calculate coordinate transformation (shift, angle)
  for all cases (including >45°)
- simplify
2026-04-24 15:27:29 +02:00
Robert Sachunsky
dbdb6d0d53 rotate: rm unused failed variants, add new rotate_image_enlarge
(correct version that enlarges canvas instead of clipping corners,
 using only OpenCV)
2026-04-24 15:27:29 +02:00
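The canvas-enlarging rotation named in dbdb6d0d53 could look roughly like the sketch below. The names `enlarged_rotation` and `rotate_enlarge` are hypothetical, and the real `rotate_image_enlarge` may differ in detail. The sketch assumes the sign convention of `cv2.getRotationMatrix2D` (positive angles rotate counter-clockwise) and, for the warp step only, that OpenCV and NumPy are installed.

```python
import math


def enlarged_rotation(width, height, angle_deg):
    """Return (2x3 affine matrix, (new_width, new_height)) for rotating
    about the image centre without clipping the corners."""
    theta = math.radians(angle_deg)
    cos_a, sin_a = math.cos(theta), math.sin(theta)
    # Bounding box of the rotated rectangle; round first so that float
    # noise (e.g. cos(90 deg) ~ 6e-17) does not add a spurious pixel.
    new_w = math.ceil(round(abs(width * cos_a) + abs(height * sin_a), 6))
    new_h = math.ceil(round(abs(width * sin_a) + abs(height * cos_a), 6))
    cx, cy = width / 2.0, height / 2.0
    # Same layout as the matrix built by cv2.getRotationMatrix2D (scale 1).
    matrix = [
        [cos_a, sin_a, (1 - cos_a) * cx - sin_a * cy],
        [-sin_a, cos_a, sin_a * cx + (1 - cos_a) * cy],
    ]
    # Shift so that the old centre lands on the centre of the new canvas.
    matrix[0][2] += new_w / 2.0 - cx
    matrix[1][2] += new_h / 2.0 - cy
    return matrix, (new_w, new_h)


def rotate_enlarge(image, angle_deg):
    """Apply the rotation with OpenCV (imported lazily; assumed installed)."""
    import cv2
    import numpy as np

    height, width = image.shape[:2]
    matrix, size = enlarged_rotation(width, height, angle_deg)
    return cv2.warpAffine(image, np.array(matrix, dtype=np.float64), size)
```

The canvas size comes from the bounding box of the rotated rectangle, so no corner can fall outside the output, at the cost of empty padding around the content.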
Robert Sachunsky
d257869d83 do_work_of_slopes_new_curved (if angle <45°): simplify, improve…
- use relative images, cropped to parent bbox (faster)
- no `scale` parameter (unused)
- use largest textline contours by area instead of first
- simplify
2026-04-24 15:27:29 +02:00
Robert Sachunsky
0dce1f24d2 do_work_of_slopes_new_curved: improve deskewing…
- return early if textline mask is empty
- intersect textline mask with parent mask
  (so neighbouring, truncated textlines
   will not interfere)
- fix bug when resulting angle is small:
  rather, compare with page angle
- if there is more than 1 line in the region,
  * use median instead of mean to estimate y_diff
  * if height dominates over width and x_diff
    over y_diff, then assume 90°: transpose image,
    deskew on that, then add 90° to result
- otherwise instead of just using page angle,
  try to estimate single-line angle by approximating
  slope of linear x-y regression on mask image;
  again, if height dominates over width, then
  assume +90° and use transposed image
- drop unused `scale` param
2026-04-24 15:27:29 +02:00
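The single-line fallback from 0dce1f24d2 (regression slope on the mask, with a transposed variant for tall regions) can be illustrated with a minimal NumPy sketch. The function name `single_line_angle` is hypothetical, the sign convention (image y axis pointing down) is an assumption, and the real code may weight or filter pixels differently.

```python
import numpy as np


def single_line_angle(mask):
    """Estimate the angle (in degrees) of one textline from its binary mask.

    Returns None for an empty mask."""
    mask = np.asarray(mask) > 0
    if not mask.any():
        return None
    ys, xs = np.nonzero(mask)
    height = ys.max() - ys.min() + 1
    width = xs.max() - xs.min() + 1
    offset = 0.0
    if height > width:
        # Height dominates: regress on the transposed mask and
        # add 90 degrees to the result afterwards.
        xs, ys = ys, xs
        offset = 90.0
    if np.ptp(xs) == 0:
        # Degenerate case (single column of pixels): no slope to fit.
        return offset
    slope = np.polyfit(xs, ys, 1)[0]  # least-squares fit of y = slope * x + b
    return offset + float(np.degrees(np.arctan(slope)))
```

Transposing before the fit keeps the regression well conditioned, because a near-vertical line would otherwise have an almost unbounded slope.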
Robert Sachunsky
97d9b0ea50 small_textlines_to_parent_adherence2: simplify, improve…
- when merging large line with small lines,
  don't use first new contour but largest
- get external contours instead of tree
  (without checking hierarchy afterwards)
- simplify
2026-04-24 15:27:29 +02:00
Robert Sachunsky
0735cb9d2b filter_contours_without_textline_inside: also filter slopes 2026-04-24 15:27:29 +02:00
Robert Sachunsky
fa8340dbb4 -cl: also filter textregions without textlines here 2026-04-24 15:27:29 +02:00
Robert Sachunsky
4a6d3968f9 major run_single refactoring…
- rename `get_regions()` → `get_early_layout()`
- split up `run_boxes_no/full_layout()` into shared
  * `get_full_layout()` (for label mapping,
    table decoding and optional full model prediction)
  * `get_deskewed_masks()` (for de-rotation)
  * extraction of various region types (polygons and confidences)
  * `run_boxes_order()` (for column detection and box ordering)
- rename `contours_tables` → `polygons_of_tables`

This further reduces redundant code, avoids splitting up the same
functionality across different places depending on mode etc.
2026-04-24 15:27:29 +02:00
Robert Sachunsky
dfb40f4a49 hsep fusion: avoid zero division if zero overlap 2026-04-24 15:27:29 +02:00
Robert Sachunsky
b63e073121 skip deskewing if no textlines 2026-04-24 15:27:29 +02:00
Robert Sachunsky
7b5aa2a1f6 more run_single refactoring…
- `run_single`: re-use `return_contours_of_interested_region`
  for extraction and filtering of text region contours
- `run_single`: isolate new function `match_deskewed_contours`
- `run_single`: apply dilation afterwards
- rename `contours_only_text_parent_d_ordered` → `polygons_of_textregions_d`
- rename `contours_only_text_parent` → `polygons_of_textregions`
- rename `contours_only_text_parent_h` → `polygons_of_textregions_h`
- `do_work_of_slopes_new_curved` and `get_slopes_and_deskew_new_curved`:
   no need for `mask_texts_only` array arg
- `filter_contours_inside_a_bigger_one`: no need for `image` as array arg,
  simplify
- `split_textregion_main_vs_head`: simplify, re-order arguments
  and return tuple logically
- if no main text regions are found, just convert marginals to main text
  and continue normally instead of stopping early w/ empty marginals (i.e.
  no textlines)
2026-04-24 15:27:29 +02:00
Robert Sachunsky
a2f43b8d69 simplify, add confidence for headings as well 2026-04-23 21:14:39 +02:00
Robert Sachunsky
264b00f8ab predictor: cache models' input shape instead of output shape 2026-04-23 21:14:39 +02:00
Robert Sachunsky
829256df91 do_prediction*: remove autosized variants, simplify 2026-04-23 21:14:39 +02:00
Robert Sachunsky
de65a55a04 mbro: simplify, add drop-caps as well, reduce batch size…
- do_order_of_regions_with_model:
  * add `polygons_of_drop_capitals`, order these indices as well
    (model was not trained for this, but it works)
  * explicit label identifiers instead of number literals
  * map marginals and images correctly
  * simplify (a lot)
  * reduce inference batch size to accommodate 8 GB VRAM GPUs
- return_indexes_of_contours_located_inside_another_list_of_contours:
  simplify
2026-04-23 21:14:39 +02:00
Robert Sachunsky
0dfc9d911f run_boxes_no_full_layout: also map to fl labels here…
(because -mbro assumes the label set from -fl)
2026-04-20 18:20:58 +02:00
Robert Sachunsky
0015f2675b with -slro, also extract and apply page (Border) mask 2026-04-20 18:20:58 +02:00
Robert Sachunsky
569b96d1a9 find_number_of_columns_in_document: pass correct label_seps…
- in fl: 6
- non-fl: 3 (now fixed)
2026-04-20 18:20:58 +02:00
Robert Sachunsky
f28a9c9e0b add confidence for all region types, prepare for textlines…
- pass on probabilities from predicted class everywhere
- rename `confidence_matrix` → `confidence_regions` / `regions_confidence`
- rename `get_textregion_confidences()` → `get_region_confidences()`
- add same for tables, textlines and regionsfl (full layout model)
- aggregate per-region confidence lists for image, table, drop-capital,
  left marginal and right marginal regions
- add in writer
- simplify/re-indent some
- try to replace more number literals with class label identifiers
2026-04-20 18:20:58 +02:00
Robert Sachunsky
1164b97917 extract_text_regions_new: fix heading thresholding…
- re-introduce boosting `heading` thresholding broken
  when refactoring (light version and do_prediction)
- also return confidence for full layout prediction
2026-04-20 18:20:58 +02:00
Robert Sachunsky
20dc5c3188 also cover drop-capital in (heuristic) reading order 2026-04-20 18:20:58 +02:00
Robert Sachunsky
92e94753c7 decoding of dropcaps in -fl: ensure consistency w/ early layout…
1. use connected component analysis to get unique segments
   in early prediction result
2. for each drop-capital segment in full prediction result,
   find matching early segment
3. when they have high overlap, assign drop-capital label
   to the entire early segment
2026-04-20 18:20:58 +02:00
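The three steps listed in 92e94753c7 can be sketched as follows. This is only an illustration under stated assumptions: the function names, the class id `DROP_LABEL = 4`, the 4-connectivity, the overlap measure (intersection divided by early segment area) and the threshold `min_overlap` are all hypothetical, and the simple flood fill stands in for whatever connected component routine the project actually uses.

```python
from collections import deque

import numpy as np

DROP_LABEL = 4  # hypothetical class id for drop-capitals


def label_components(image):
    """Label 4-connected runs of equal non-zero value; return (labels, count)."""
    image = np.asarray(image)
    labels = np.zeros(image.shape, dtype=np.int32)
    count = 0
    for seed in zip(*np.nonzero(image)):
        if labels[seed]:
            continue
        count += 1
        labels[seed] = count
        todo = deque([seed])
        while todo:
            y, x = todo.popleft()
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                if (0 <= ny < image.shape[0] and 0 <= nx < image.shape[1]
                        and image[ny, nx] == image[y, x]
                        and not labels[ny, nx]):
                    labels[ny, nx] = count
                    todo.append((ny, nx))
    return labels, count


def transfer_drop_capitals(early, full, min_overlap=0.5):
    """Give whole early segments the drop-capital label where the full
    layout prediction covers them sufficiently."""
    early = np.asarray(early)
    full = np.asarray(full)
    result = early.copy()
    # Step 1: unique segments in the early prediction result.
    early_segments, _ = label_components(early)
    drop_segments, n_drop = label_components((full == DROP_LABEL).astype(int))
    for d in range(1, n_drop + 1):
        drop = drop_segments == d
        # Step 2: early segments touched by this drop-capital segment.
        candidates = np.unique(early_segments[drop])
        for e in candidates[candidates > 0]:
            segment = early_segments == e
            overlap = np.count_nonzero(segment & drop) / np.count_nonzero(segment)
            # Step 3: on high overlap, relabel the entire early segment.
            if overlap >= min_overlap:
                result[segment] = DROP_LABEL
    return result
```

Relabelling the whole early segment, rather than only the overlapping pixels, is what keeps the two prediction results consistent.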
Robert Sachunsky
29b42fdfaa decoding of drop-capitals in full layout: also allow replacing img…
- rename `putt_bb_of_drop_capitals_of_model_in_patches_in_layout`
  → `fill_bb_of_drop_capitals`
- also allow image (besides text) label in early layout prediction
  result when checking if entire bbox can be filled (as opposed to
  just drop-capital | image | background mask)
- simplify
2026-04-16 18:37:27 +02:00
Robert Sachunsky
6e0aed35f4 run_boxes_*: simplify, document class label mappings, start using
identifier constants instead of literals for labels
2026-04-16 18:37:27 +02:00
Robert Sachunsky
f29e876a7c return_boxes_of_images_by_order_of_reading_new: sep label differs w/o -fl…
fix bug where in non-full mode, the wrong class label was assumed
for separator regions (3 in non- vs 6 in full layout mode):

- pass in separator mask instead of full segmentation map
- rename for clarity:
  - `regions_without_separators` → `text_mask` (already binary)
  - `regions_with_separators` → `sep_mask` (now just binary)
2026-04-16 05:16:23 +02:00
Robert Sachunsky
f5f2435a38 run_marginals: drop unnecessarily passing textline_mask, mask_seps, mask_images 2026-04-16 05:13:06 +02:00
Robert Sachunsky
9309586712 split_textregion_main_vs_header → split_textregion_main_vs_head…
(and simplify)
2026-04-16 05:07:22 +02:00
Robert Sachunsky
0f82b568ba do_prediction_new_concept: aggregate confidence for all classes…
(not just text; will still have to pass that on to the writer...)
2026-04-16 05:02:20 +02:00
Robert Sachunsky
5a27e46b22 keep seps over artificial boundaries to improve col separation…
(thresholding and decoding with artificial boundary class can
 overwrite existing column separators, which in turn can contribute
 to missing column boundaries; this prioritises seps over boundaries,
 which does not impair separation of instances, as seps will separate
 text/image/etc instances just as well as artificial boundaries)
2026-04-16 04:56:38 +02:00
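The priority rule from 5a27e46b22 might be sketched like this. The class ids, the position of the boundary class as the last channel, the threshold value and the function name `decode_with_boundaries` are assumptions made for the example, not values taken from the code base.

```python
import numpy as np

# Hypothetical class layout; the artificial boundary is assumed
# to be the last channel of the probability map.
BACKGROUND, TEXT, IMAGE, SEP, BOUNDARY = 0, 1, 2, 3, 4


def decode_with_boundaries(probs, boundary_threshold=0.3):
    """Decode an H x W x C probability map into a label map.

    Pixels above the boundary threshold become background (to split
    neighbouring instances), unless the best regular class is a separator."""
    probs = np.asarray(probs)
    # Best class among the regular (non-boundary) channels.
    labels = probs[..., :BOUNDARY].argmax(axis=-1)
    boundary = probs[..., BOUNDARY] > boundary_threshold
    # Separators take priority over artificial boundaries.
    labels[boundary & (labels != SEP)] = BACKGROUND
    return labels
```

Because a separator also splits adjacent text or image instances, leaving it in place does not undo the effect the boundary class is meant to have.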
Robert Sachunsky
9d6ff65e1d get_tables_from_model: utilise artificial bound thresholding…
(to improve separation of neighbouring tables, esp. across
 columns; since model's threshold class is particularly weak,
 also use lower threshold here)
2026-04-16 04:49:07 +02:00
Robert Sachunsky
12b1271487 layout cli: add option --halt-fail 2026-04-13 01:19:47 +02:00
Robert Sachunsky
56e6deb02c predictor: jit-compile and precompile (non-autosized) models 2026-04-13 01:17:04 +02:00
Robert Sachunsky
01c54eb2ef reduce inference batch sizes to accommodate 8 GB VRAM
(still pending a solution for flexible batch sizes)
2026-04-13 01:15:25 +02:00
Robert Sachunsky
f44c39667e predictor: disable rebatching (until we have flexible batch sizes) 2026-04-13 01:14:49 +02:00
Robert Sachunsky
219954d15b predictor: use predict_on_batch instead of predict 2026-04-13 01:14:18 +02:00
Robert Sachunsky
0d21b62aee disable autosized prediction entirely (also for _patched)…
When 338c4a0e wrapped all prediction models for automatic
image size adaptation in CUDA,
- tiling (`_patched`) was indeed faster
- whole  (`_resized`) was actually slower

But CUDA-based tiling also increases GPU memory requirements
a lot. And with the new parallel subprocess predictors, Numpy-
based tiling is not necessarily slower anymore.
2026-04-10 18:23:10 +02:00
Robert Sachunsky
ccef63f08b get_regions: always use resized/enhanced image…
(avoid strange image handling short-cut, which uses
 early cropped image used for column classification
 instead of normal image in 1/2-column cases;
 fixes accuracy issues of region_1_2 model on these images)
2026-04-10 18:17:51 +02:00
Robert Sachunsky
04da66ed73 training: plot only ~ 1000 training and ~ 100 validation images 2026-03-30 13:34:05 +02:00
Robert Sachunsky
a8556f5210 run: sort parallel log messages by file name instead of prefixing…
(as follow-up to ec08004f:)

- create log queues and QueueListener separately for each job
- receive job logs sequentially
- drop log filter mechanism (prefixing log messages by file name)
- also count ratio of successful jobs
2026-03-30 13:18:40 +02:00
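The per-job log queues described in a8556f5210 can be approximated with the standard library alone. The sketch below is simplified in several ways that should be read as assumptions: it uses threads instead of subprocesses, replays each queue only after all jobs have finished, and simulates a failure for any name ending in `.bad`. The names `run_job` and `run_all` are hypothetical.

```python
import logging
import logging.handlers
import queue
from concurrent.futures import ThreadPoolExecutor


def run_job(name, log_queue):
    """Process one file, sending all log records to this job's own queue."""
    logger = logging.getLogger("job." + name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.handlers.QueueHandler(log_queue)
    logger.addHandler(handler)
    try:
        logger.info("processing %s", name)
        if name.endswith(".bad"):  # simulated failure, for illustration only
            raise ValueError("cannot read " + name)
        logger.info("finished %s", name)
        return True
    except ValueError as err:
        logger.error("failed: %s", err)
        return False
    finally:
        logger.removeHandler(handler)


def run_all(names, target_handler, workers=4):
    """Run all jobs in parallel, then emit their logs sorted by file name.

    Returns the ratio of successful jobs."""
    queues = {name: queue.Queue() for name in names}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {name: pool.submit(run_job, name, queues[name])
                   for name in names}
        results = {name: future.result() for name, future in futures.items()}
    for name in sorted(names):
        # One listener per job; stop() flushes the pending records
        # before the next job's queue is read.
        listener = logging.handlers.QueueListener(queues[name], target_handler)
        listener.start()
        listener.stop()
    return sum(results.values()) / len(names) if names else 0.0
```

With one queue per job, messages from different files never interleave, so no file name prefix or log filter is needed to tell them apart.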
Robert Sachunsky
1756443605 fixup device sel 2026-03-16 15:35:07 +01:00
Robert Sachunsky
6bbdcc39ef CLI/Eynollah.setup_models/ModelZoo.load_models: add device option/kwarg
allow setting device specifier to load models into

either
- CPU or single GPU0, GPU1 etc
- per-model patterns, e.g. col*:CPU,page:GPU0,*:GPU1

pass through as kwargs until `ModelZoo.load_models()` sets up TF
2026-03-15 04:54:04 +01:00
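A device specifier of the form shown in 6bbdcc39ef could be resolved roughly as below. This is a sketch under assumptions: the function name `resolve_devices` is hypothetical, patterns are treated as shell-style globs, the first matching pattern wins, and models without any match map to `None`. The actual parsing in `ModelZoo.load_models()` may follow different rules.

```python
from fnmatch import fnmatchcase


def resolve_devices(spec, model_names):
    """Map each model name to a device string according to `spec`.

    `spec` is either a single device ("CPU", "GPU0", ...) or a
    comma-separated list of pattern:device pairs."""
    if ":" not in spec:
        # A single device applies to every model.
        return {name: spec for name in model_names}
    rules = [tuple(item.split(":", 1)) for item in spec.split(",") if item]
    devices = {}
    for name in model_names:
        devices[name] = None  # assumption: no match means "unspecified"
        for pattern, device in rules:
            if fnmatchcase(name, pattern):  # first matching pattern wins
                devices[name] = device
                break
    return devices
```

For example, `resolve_devices("col*:CPU,page:GPU0,*:GPU1", ["col_classifier", "page", "region"])` places the first model on the CPU, the second on GPU0 and the third on GPU1.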
Robert Sachunsky
67e9f84b54 do_prediction* for "col_classifier": pass array as float16 instead of float64 2026-03-15 03:20:39 +01:00