Commit graph

  • 6241530293 Merge dbe06867a6 into 0f410c2e7c Konstantin Baierer 2025-12-10 13:24:44 +00:00
  • dbe06867a6 wip: remove textline_light=True from call to EynollahXmlWriter reduce-complexity-rebased kba 2025-12-10 14:24:32 +01:00
  • 58000069cf Restore correct execution of export_textline_images_and_text vahidrezanezhad 2025-12-03 15:40:52 +01:00
  • 5716262629 Fix eynollah ocr --help so it works again vahidrezanezhad 2025-12-03 14:11:47 +01:00
  • 86d437b77b Restored correct functionality of the extract_only_images mode and cleaned up the argument handling vahidrezanezhad 2025-12-03 12:01:42 +01:00
  • 4175b52768 log to STDERR not STDOUT kba 2025-12-02 15:00:33 +01:00
  • 04d21b9d92 🔥 refactor eynollah ocr kba 2025-11-28 14:54:43 +01:00
  • 244847e3d4 move line-gt extraction out of ocr to eynollah-training kba 2025-11-28 12:09:50 +01:00
  • 058478baf3 CI: do not upgrade (now-unpineed) torch kba 2025-11-28 15:03:06 +01:00
  • fcd87fc3cf 💀 remove dead code from eynollah.py kba 2025-12-10 13:14:32 +01:00
  • 1eef5514d7 eynollah.py: fix kwargs to writer kba 2025-11-28 10:50:50 +01:00
  • b7d3a6724b enforce kwargs for writer.build_... kba 2025-11-27 12:43:45 +01:00
  • 97959869ba remove more branches after textline_light default true kba 2025-11-27 11:30:00 +01:00
  • 5d497b0f72 factor out extract_only_images as eynollah extract-images kba 2025-11-26 21:35:45 +01:00
  • b10773aae6 🔥 replace light_version/textline_light with True kba 2025-12-10 12:56:01 +01:00
  • 4fc3ff33cb The cnn-rnn ocr model can be trained now adding-cnn-rnn-training-script vahidrezanezhad 2025-12-09 17:22:12 +01:00
  • 84a72a128b cnn-rnn model can be called - model input height and width are dynamic now - data generator is also callable vahidrezanezhad 2025-12-09 15:30:19 +01:00
  • 59e5a73654 adding cnn-rnn training script vahidrezanezhad 2025-12-08 19:30:57 +01:00
  • e715b5d960 Merge 7bf5e077d9 into 0f410c2e7c Konstantin Baierer 2025-12-03 14:41:05 +00:00
  • 7bf5e077d9 Restore correct execution of export_textline_images_and_text reduce-complexity vahidrezanezhad 2025-12-03 15:40:52 +01:00
  • 6ac37af2f8 Fix eynollah ocr --help so it works again vahidrezanezhad 2025-12-03 14:11:47 +01:00
  • d687d862d6 Restored correct functionality of the extract_only_images mode and cleaned up the argument handling vahidrezanezhad 2025-12-03 12:01:42 +01:00
  • 92897c5f4b Merge 9fdae72e96 into 38c028c6b5 Robert Sachunsky 2025-12-03 02:05:39 +00:00
  • 9fdae72e96 utils_ocr.return_textline_contour: gen cv2-like contours (w/ ndim=3, as in all other places) Robert Sachunsky 2025-12-03 03:04:46 +01:00
  • ad8f8167c2 separate_lines/_vertical: gen cv2-like contours (w/ ndim=3, as in all other places) Robert Sachunsky 2025-12-03 00:58:26 +01:00
  • 43a95842bd writer: also ensure validity after scaling Robert Sachunsky 2025-12-02 16:35:32 +01:00
  • 51abe9617a log to STDERR not STDOUT kba 2025-12-02 15:00:33 +01:00
  • 56e73bf72f deskewing: add a 2nd stage for precision Robert Sachunsky 2025-11-28 18:27:58 +01:00
  • adcea47bc0 return_boxes_of_images_by_order_of_reading_new: always erode Robert Sachunsky 2025-11-28 18:23:59 +01:00
  • 5a3de3b42d column detection: improve, aided by vseps whenever possible Robert Sachunsky 2025-11-28 18:14:24 +01:00
  • 4dd40c542b find_num_col: add optional criterion - sum of vertical separators Robert Sachunsky 2025-11-28 18:07:15 +01:00
  • 84d10962f3 return_boxes_of_images_by_order_of_reading_new: improve Robert Sachunsky 2025-11-28 18:04:12 +01:00
  • 5abf0c1097 return_boxes_of_images_by_order_of_reading_new: improve Robert Sachunsky 2025-11-28 17:58:44 +01:00
  • b71bb80e3a return_boxes_of_images_by_order_of_reading_new: fix 4abc2ff5 Robert Sachunsky 2025-11-28 17:53:27 +01:00
  • a527d7a10d combine_hor_lines_and_delete_cross_points: improve Robert Sachunsky 2025-11-28 17:34:11 +01:00
  • 5c12b6a851 combine_hor_lines_and_delete_cross_points: simplify and rename Robert Sachunsky 2025-11-28 17:27:12 +01:00
  • 06cb9d1d31 combine_hor_lines_and_delete_cross_points: fix 1-off px bug Robert Sachunsky 2025-11-28 17:08:39 +01:00
  • 38d91673b1 combine_hor_lines_and_delete_cross_points: get external contours Robert Sachunsky 2025-11-28 16:50:08 +01:00
  • ee59a6809d contours_in_same_horizon: fix 5d15941b Robert Sachunsky 2025-11-28 16:17:09 +01:00
  • b161e33854 🔥 refactor eynollah ocr kba 2025-11-28 14:54:43 +01:00
  • 30f9c695dc move line-gt extraction out of ocr to eynollah-training kba 2025-11-28 12:09:50 +01:00
  • 951bd2fce6 CI: do not upgrade (now-unpineed) torch kba 2025-11-28 15:03:06 +01:00
  • 9bcfeab057 💀 remove dead code from eynollah.py kba 2025-11-28 10:46:47 +01:00
  • 5171e09c2d eynollah.py: fix kwargs to writer kba 2025-11-28 10:50:50 +01:00
  • c24cf94bce enforce kwargs for writer.build_... kba 2025-11-27 12:43:45 +01:00
  • acb91efe48 WIP: reorganize OCR-D and start on ocrd-eynollah-ocr ocrd-restructure kba 2025-11-28 11:15:53 +01:00
  • 4aa9543a7d remove more branches after textline_light default true kba 2025-11-27 11:30:00 +01:00
  • 177d555ded factor out extract_only_images as eynollah extract-images kba 2025-11-26 21:35:45 +01:00
  • 83e8b289da 🔥 drop light_version/textline_light (now default and implied) kba 2025-11-26 20:29:29 +01:00
  • ca83cf934d fix imports from src/cli/cli_*/*_cli kba 2025-11-26 20:48:14 +01:00
  • 095b36c389 models: split into layout, extra and ocr kba 2025-11-26 19:45:58 +01:00
  • 000af16a47 🔥 remove torch pinning kba 2025-11-26 19:23:49 +01:00
  • e503c1a0b7 drop obsolete multi-model binarization kba 2025-11-26 18:19:03 +01:00
  • 82266f8234 reorganize cli kba 2025-11-26 18:42:39 +01:00
  • 5a1900e664 🔥 remove OCR option from eynollah layout kba 2025-11-26 15:34:36 +01:00
  • d4bd72c2ae Merge 0f410c2e7c into 38c028c6b5 Konstantin Baierer 2025-11-26 15:38:01 +00:00
  • 0f410c2e7c disable tf/keras logging on first import model-zoo kba 2025-11-26 16:37:54 +01:00
  • 9d9d32daed update OCR-D bindings kba 2025-11-26 16:20:27 +01:00
  • 103c007368 . kba 2025-11-26 14:37:00 +01:00
  • 0149147e95 . kba 2025-11-25 13:45:47 +01:00
  • 5032b54f98 Merge 42a3cc2335 into 38c028c6b5 Robert Sachunsky 2025-11-18 19:05:29 +01:00
  • e428e7ad78 ensure separators stay within image bounds Robert Sachunsky 2025-11-16 12:17:29 +01:00
  • 406288b1fe fixup 72d059f3: forgot to update other writer calls Robert Sachunsky 2025-11-15 20:13:58 +01:00
  • 028ed16921 adapt ocrd-sbb-binarize Robert Sachunsky 2025-11-15 17:17:37 +01:00
  • 49ab269e08 fix typos found by ruff Robert Sachunsky 2025-11-15 15:46:08 +01:00
  • 72d059f3c9 reading order: simplify assignment / counting Robert Sachunsky 2025-11-15 14:34:12 +01:00
  • 5a778003fd contour matching for deskewed image: ensure matches for both sides Robert Sachunsky 2025-11-15 14:32:22 +01:00
  • 3c15c4f7d4 back to rotate_image instead of rotation_image_new for deskewing Robert Sachunsky 2025-11-15 14:29:41 +01:00
  • 4475183f08 improve rules governing column split Robert Sachunsky 2025-11-14 03:39:36 +01:00
  • 4abc2ff572 rewrite/simplify manual reading order using recursive algorithm Robert Sachunsky 2025-11-14 03:05:02 +01:00
  • 95f76081d1 rename some more identifiers: Robert Sachunsky 2025-11-14 02:22:39 +01:00
  • 1a76ce177d do_order_of_regions: round contour centers Robert Sachunsky 2025-11-14 02:07:20 +01:00
  • 67003b837c . kba 2025-11-13 16:56:04 +01:00
  • c69248696b Fix syntax error in model list fixing_reading_order_issues_of_pr_206 vahidrezanezhad 2025-11-13 15:43:55 +01:00
  • d66549012f . kba 2025-11-13 14:57:28 +01:00
  • b9bc8e79c0 github ci: cache models with model_zoo default config as key kba 2025-11-13 13:58:38 +01:00
  • b34329dd61 tests: more path fixes kba 2025-11-13 12:21:48 +01:00
  • 9aeff6d155 tests: typo kba 2025-11-13 11:49:09 +01:00
  • a72be69958 tests: fix model download URL kba 2025-11-13 11:48:23 +01:00
  • 3afbce023d tests: adapt paths kba 2025-11-13 11:46:31 +01:00
  • ea38733e4c Machine-based reading order for right-to-left documents is enabled vahidrezanezhad 2025-11-12 19:09:49 +01:00
  • e60b0e5911 Revert to older deskew slope calculation — pairing between skewed and original contours was incorrect, so the original pairing logic has been restored. Also restored some original functions to ensure correct reading order detection. vahidrezanezhad 2025-11-12 18:24:50 +01:00
  • ed5b5c13dd Add test images; call TrOCR processor from the same directory as the TrOCR model vahidrezanezhad 2025-11-07 12:47:21 +01:00
  • 8732007aaf . kba 2025-11-06 14:15:33 +01:00
  • f902756ce1 try importing torch, then shapely, then tensorflow kba 2025-11-06 13:10:35 +01:00
  • 44037bc05d add layout marginalia test kba 2025-11-06 12:41:03 +01:00
  • d224b0f7e8 try with shapely.set_precision(...mode="keep_collpased") kba 2025-11-06 11:55:40 +01:00
  • 0d84e7da16 Merge remote-tracking branch 'origin/docs_and_minor_fixes' into model-zoo kba 2025-11-06 11:37:10 +01:00
  • 53e879e289 make *test: another typo; kba 2025-11-05 16:19:55 +01:00
  • e449dbab6d make *test: fix paths kba 2025-11-05 15:28:41 +01:00
  • 0bef6e297b make models: unzip to the versioned directory kba 2025-11-05 15:19:16 +01:00
  • 2c211095d7 make deps-test should not depend on the models kba 2025-11-05 15:02:55 +01:00
  • b6c7283b4d further debugging kba 2025-11-05 14:39:30 +01:00
  • 8f085db187 Merge f90259d6e2 into 38c028c6b5 Clemens Neudecker 2025-10-30 21:26:01 +00:00
  • f90259d6e2 fix docs links docs_and_minor_fixes cneud 2025-10-30 22:24:54 +01:00
  • d5b7089bad Merge branch 'docs_and_minor_fixes' of https://github.com/qurator-spk/eynollah into docs_and_minor_fixes cneud 2025-10-30 22:17:41 +01:00
  • 9dbac280cc Revert "remove unnecessary backslash" cneud 2025-10-30 22:16:53 +01:00
  • 2d35a0598d Revert "replace list declaration with list literal (faster)" cneud 2025-10-30 22:16:48 +01:00
  • 70d8577a15 Revert "remove redundant parentheses" cneud 2025-10-30 22:16:41 +01:00
  • c9efbe1871 refactor image layout in examples.md Clemens Neudecker 2025-10-30 16:52:59 +01:00