Commit graph

  • 5a0e4c3b0f find_number_of_columns_in_document: improve splitter rule Robert Sachunsky 2025-10-20 13:36:10 +02:00
  • 542d38ab43 find_number_of_columns_in_document: simplify, rename lineseps Robert Sachunsky 2025-10-20 13:34:56 +02:00
  • d3d599b010 order_of_regions: add better plotting (but commented out) Robert Sachunsky 2025-10-20 13:27:23 +02:00
  • c43a825d1d order_of_regions: filter out-of-image peaks Robert Sachunsky 2025-10-20 13:26:01 +02:00
  • 48761c3e12 find_num_col: simplify, add better plotting (but commented out) Robert Sachunsky 2025-10-20 13:20:12 +02:00
  • 184927fb54 find_num_cols: re-sort peaks when cutting n-best num_col_classifier Robert Sachunsky 2025-10-20 13:16:57 +02:00
  • 086c1880ac binarization: add option --overwrite, skip existing outputs Robert Sachunsky 2025-10-15 12:24:21 +02:00
  • c8455370a9 updating heuristics and ocr documentation vahidrezanezhad 2025-10-20 15:13:45 +02:00
  • 3ec5ceb22e Update flowchart vahidrezanezhad 2025-10-20 14:55:14 +02:00
  • 9d2dbb8388 updating model based reading orde detection vahidrezanezhad 2025-10-20 14:47:55 +02:00
  • 496a0e2ca4 readme and documentation updates cneud 2025-10-17 19:19:26 +02:00
  • f212ffa22d remove unnecessary backslash cneud 2025-10-17 18:27:18 +02:00
  • 9733d575bf replace list declaration with list literal (faster) cneud 2025-10-17 18:21:49 +02:00
  • 20a95365c2 remove redundant parentheses cneud 2025-10-17 18:19:00 +02:00
  • 2a1f892d72 expand keywords and supported Python versions cneud 2025-10-17 18:17:41 +02:00
  • 6c89888166 Refactor CLI for consistent logging and late imports kba 2025-10-17 17:47:59 +02:00
  • 557fb227f3 training/gt_gen_utils: fix type errors, comment out dead code ruff-training kba 2025-10-17 14:21:05 +02:00
  • af74890b2e training/inference.py: add typing info, organize imports kba 2025-10-17 14:07:43 +02:00
  • 3a73ccca2e training/models.py: make imports explicit kba 2025-10-17 13:45:14 +02:00
  • 38c028c6b5 📦 v0.6.0 v0.6.0 kba 2025-10-17 10:36:30 +02:00
  • ca8edb35e3 📝 changelog kba 2025-10-17 10:35:13 +02:00
  • 50e8b2c266 Merge branch 'integrate-training-from-sbb_pixelwise_segmentation' kba 2025-10-17 10:33:04 +02:00
  • 46d25647f7 📝 changelog kba 2025-10-16 20:46:03 +02:00
  • 2ac01ecacc join_polygons: try to catch rare case of MultiPolygon Robert Sachunsky 2025-10-15 16:58:17 +02:00
  • 2e0fb64dcb disable ruff check for training code for now kba 2025-10-16 21:29:37 +02:00
  • 76c13bcfd7 Merge branch 'integrate-training-from-sbb_pixelwise_segmentation' of https://github.com/qurator-spk/eynollah into integrate-training-from-sbb_pixelwise_segmentation kba 2025-10-16 20:50:24 +02:00
  • af5abb77fd Merge branch 'main' into integrate-training-from-sbb_pixelwise_segmentation kba 2025-10-16 20:50:16 +02:00
  • d2f0a43088 📝 changelog kba 2025-10-16 20:46:03 +02:00
  • 3bd3faef68 Merge pull request #193 from qurator-spk/training-installation Konstantin Baierer 2025-10-16 20:39:17 +02:00
  • 2e0c1868e0 move models.py back to src/.../training old-pr-16 kba 2025-10-16 20:36:16 +02:00
  • b67a3c4ed4 tf.keras version that allows any input resolution H.T. Kruitbosch 2024-01-11 19:04:42 +01:00
  • 662aa67dfb move models.py to root to cherry-pick 3098700 kba 2025-10-16 20:31:48 +02:00
  • ad53ea3ae1 move train.py back ReduceLROnPlateau kba 2025-10-16 20:20:41 +02:00
  • 54132a499a Merge remote-tracking branch 'pixelwise_local/ReduceLROnPlateau' into ReduceLROnPlateau kba 2025-10-16 20:20:06 +02:00
  • 30fe51f3ae move src/.../train.py to root to accomodate old PR kba 2025-10-16 20:05:00 +02:00
  • 1e66c85222 Merge branch 'integrate-training-from-sbb_pixelwise_segmentation' into training-installation kba 2025-10-16 16:18:02 +02:00
  • bd8c8bfeac training: pin numpy to <1.24 as well kba 2025-10-16 16:15:31 +02:00
  • 948c8c3441 join_polygons: try to catch rare case of MultiPolygon Robert Sachunsky 2025-10-15 16:58:17 +02:00
  • f485dd4181 📦 v0.6.0rc2 v0.6.0rc2 kba 2025-10-14 16:10:50 +02:00
  • c1f0158806 📝 changelog kba 2025-10-14 14:53:15 +02:00
  • 7daa0a1bd5 Merge branch 'fix-196' into prepare-v0.6.0rc2 kba 2025-10-14 14:52:36 +02:00
  • 2febf53479 📝 changelog kba 2025-10-14 14:52:31 +02:00
  • 8299e7009a setup_models: avoid unnecessarily loading region_fl Robert Sachunsky 2025-10-14 14:23:29 +02:00
  • e8b7212f36 polygon2contour: avoid uint for coords Robert Sachunsky 2025-10-14 14:16:39 +02:00
  • 745cf3be48 XML encoding should be utf-8 not utf8 kba 2025-10-10 16:39:16 +02:00
  • 2056a8bdb9 📦 v0.6.0rc1 v0.6.0rc1 kba 2025-10-10 16:32:47 +02:00
  • 34f5996194 makefile: update models kba 2025-10-10 13:02:14 +02:00
  • 09195aeee9 Merge remote-tracking branch 'bertsky/loky-with-shm-for-175-rebuilt' into prepare-v0.6.0 kba 2025-10-10 12:49:14 +02:00
  • 4e9a1618c3 layout: refactor model setup, allow loading custom versions Robert Sachunsky 2025-10-10 03:18:09 +02:00
  • 374818de11 📝 update changelog for 5725e4f Robert Sachunsky 2025-10-09 23:11:05 +02:00
  • c4cb16c2a8 simplify Robert Sachunsky 2025-10-09 23:05:50 +02:00
  • ecb53056f2 Merge branch 'main' of https://github.com/qurator-spk/eynollah into loky-with-shm-for-175-rebuilt Robert Sachunsky 2025-10-09 22:54:11 +02:00
  • d96af425a7 Merge pull request #4 from bertsky/loky-with-shm-for-175-rebuilt-refactored Robert Sachunsky 2025-10-09 22:18:53 +02:00
  • cab392601e 📝 update changelog Robert Sachunsky 2025-10-09 20:12:06 +02:00
  • e1b56d97da CI: lint with ruff Robert Sachunsky 2025-10-08 17:54:38 +02:00
  • a144026b27 add rough ruff config Robert Sachunsky 2025-10-08 15:13:57 +02:00
  • b3d29bef89 return_contours_of_interested_region*: rm unused variants Robert Sachunsky 2025-10-08 19:21:07 +02:00
  • 8a2d682e12 fix identifier scope in layout OCR options (w/o full_layout) Robert Sachunsky 2025-10-08 16:52:22 +02:00
  • 096def1e9d mbreorder/enhancment: fix missing imports Robert Sachunsky 2025-10-08 15:13:13 +02:00
  • 027b87d321 fixup c0137c2 (missing arguments for utils_ocr) Robert Sachunsky 2025-10-08 14:56:57 +02:00
  • 1d4815b48f utils_ocr: forgot to pass coordinate offsets Robert Sachunsky 2025-10-08 14:56:14 +02:00
  • 839b7c4d84 make models: avoid re-download Robert Sachunsky 2025-10-08 12:33:14 +02:00
  • e5b5264568 CI: add diagnostic message for model symlink Robert Sachunsky 2025-10-08 12:17:53 +02:00
  • ca72a095ca tests: cover table detection in various modes Robert Sachunsky 2025-10-08 00:44:32 +02:00
  • 5e11a68a3e writer/run_single: consistent kwarg naming conf_contours_textregion(s) Robert Sachunsky 2025-10-08 01:03:48 +02:00
  • 75823f9bed run_single: call writer.build_pagexml_no_full_layout w/ kwargs Robert Sachunsky 2025-10-08 00:54:53 +02:00
  • cbbb3248c7 writer: simplify Robert Sachunsky 2025-10-08 00:43:29 +02:00
  • e32479765c writer: simplify Robert Sachunsky 2025-10-07 23:03:27 +02:00
  • d88ca18eec get/do_work_of_slopes etc.: reduce call/return signatures Robert Sachunsky 2025-10-07 22:53:30 +02:00
  • 02a347a48a no more need to rm from contours_only_text_parent_d_ordered now Robert Sachunsky 2025-10-07 22:47:34 +02:00
  • fd43e78442 filter_contours_without_textline_inside: simplify Robert Sachunsky 2025-10-07 22:42:36 +02:00
  • 0a80cd5dff avoid unnecessary 3-channel conversions: for tables, too Robert Sachunsky 2025-10-07 22:37:05 +02:00
  • dfdc705375 do_work_of_slopes: rm unused old variant Robert Sachunsky 2025-10-07 22:33:06 +02:00
  • 2e907875c1 get_text_region_boxes_by_given_contours: simplify Robert Sachunsky 2025-10-07 22:32:06 +02:00
  • d53f829dfd filter_contours_inside_a_bigger_one: fix edge case in 81827c29 Robert Sachunsky 2025-10-07 22:06:57 +02:00
  • 18bbdb7c48 CI: run deps-test with OCR extra so symlink rule fires Robert Sachunsky 2025-10-07 00:54:25 +02:00
  • 23535998f7 tests: symlink OCR models into layout model directory Robert Sachunsky 2025-10-06 21:27:21 +02:00
  • a1904fa660 tests: cover layout with OCR in various modes Robert Sachunsky 2025-10-06 17:44:12 +02:00
  • 595ed02743 run_single: simplify; allow running TrOCR in non-fl mode, too Robert Sachunsky 2025-10-06 17:24:50 +02:00
  • 6e57ab3741 textline_contours_postprocessing: do not catch arbitrary exceptions Robert Sachunsky 2025-10-06 16:53:59 +02:00
  • fe603188f4 avoid unnecessary 3-channel conversions Robert Sachunsky 2025-10-06 13:11:03 +02:00
  • 155b8f68b8 matching deskewed text region contours with predicted: improve Robert Sachunsky 2025-10-06 12:58:24 +02:00
  • 0e00d7868b matching deskewed text region contours with predicted: improve Robert Sachunsky 2025-10-06 12:55:10 +02:00
  • 0f33c21eb3 matching deskewed text region contours with predicted: improve Robert Sachunsky 2025-10-05 02:45:01 +02:00
  • 73e5a1def8 matching deskewed text region contours with predicted: simplify Robert Sachunsky 2025-10-05 02:33:03 +02:00
  • d774a23daa matching deskewed text region contours with predicted: simplify Robert Sachunsky 2025-10-05 02:18:17 +02:00
  • 29b4527bde do_order_of_regions: simplify Robert Sachunsky 2025-10-03 02:06:08 +02:00
  • e674ea08f3 do_order_of_regions: drop redundant no/full_layout Robert Sachunsky 2025-10-03 00:59:25 +02:00
  • e9bb62bd86 do_order_of_regions: simplify Robert Sachunsky 2025-10-02 23:44:00 +02:00
  • 7387f5a929 do_order_of_regions: improve box matching, simplify Robert Sachunsky 2025-10-02 22:35:40 +02:00
  • 4950e6bd78 order_of_regions: simplify Robert Sachunsky 2025-10-02 22:28:52 +02:00
  • a1c8fd4467 do_order_of_regions / order_of_regions: simplify Robert Sachunsky 2025-10-02 21:41:37 +02:00
  • 415b2cbad8 eynollah, drop_capitals: simplify Robert Sachunsky 2025-10-02 21:36:22 +02:00
  • 3f3353ec3a do_order_of_regions: simplify Robert Sachunsky 2025-10-02 21:28:04 +02:00
  • 8c3d5eb0eb separate_marginals_to_left_and_right_and_order_from_top_to_down: simplify Robert Sachunsky 2025-10-02 21:07:35 +02:00
  • ff40f06bca Merge branch 'main' into prepare-v0.6.0 kba 2025-10-09 14:05:29 +02:00
  • 8215814a3f Merge branch 'changelog-v0.5.0' kba 2025-10-09 14:03:45 +02:00
  • 4ffe6190d2 📝 changelog kba 2025-10-09 14:03:26 +02:00
  • 8869c20c33 updating CHANGELOG for v0.5.0 vahidrezanezhad 2025-10-06 14:53:47 +02:00
  • 584cde7eb8 updating CHANGELOG for v0.5.0 updating_CHANGELOG_v0.5.0 vahidrezanezhad 2025-10-06 14:53:47 +02:00