Commit graph

  • 34761d3ab5 Merge ed5b5c13dd into 38c028c6b5 Konstantin Baierer 2025-11-07 11:47:31 +00:00
  • ed5b5c13dd Add test images; call TrOCR processor from the same directory as the TrOCR model model-zoo vahidrezanezhad 2025-11-07 12:47:21 +01:00
  • 8732007aaf . kba 2025-11-06 14:15:33 +01:00
  • f902756ce1 try importing torch, then shapely, then tensorflow kba 2025-11-06 13:10:35 +01:00
  • 44037bc05d add layout marginalia test kba 2025-11-06 12:41:03 +01:00
  • d224b0f7e8 try with shapely.set_precision(...mode="keep_collpased") kba 2025-11-06 11:55:40 +01:00
  • 0d84e7da16 Merge remote-tracking branch 'origin/docs_and_minor_fixes' into model-zoo kba 2025-11-06 11:37:10 +01:00
  • 53e879e289 make *test: another typo; kba 2025-11-05 16:19:55 +01:00
  • e449dbab6d make *test: fix paths kba 2025-11-05 15:28:41 +01:00
  • 0bef6e297b make models: unzip to the versioned directory kba 2025-11-05 15:19:16 +01:00
  • 2c211095d7 make deps-test should not depend on the models kba 2025-11-05 15:02:55 +01:00
  • b6c7283b4d further debugging kba 2025-11-05 14:39:30 +01:00
  • 8f085db187 Merge f90259d6e2 into 38c028c6b5 Clemens Neudecker 2025-10-30 21:26:01 +00:00
  • f90259d6e2 fix docs links docs_and_minor_fixes cneud 2025-10-30 22:24:54 +01:00
  • d5b7089bad Merge branch 'docs_and_minor_fixes' of https://github.com/qurator-spk/eynollah into docs_and_minor_fixes cneud 2025-10-30 22:17:41 +01:00
  • 9dbac280cc Revert "remove unnecessary backslash" cneud 2025-10-30 22:16:53 +01:00
  • 2d35a0598d Revert "replace list declaration with list literal (faster)" cneud 2025-10-30 22:16:48 +01:00
  • 70d8577a15 Revert "remove redundant parentheses" cneud 2025-10-30 22:16:41 +01:00
  • c9efbe1871 refactor image layout in examples.md Clemens Neudecker 2025-10-30 16:52:59 +01:00
  • 8782ef17b2 CI: 🔥 upgrade torch for debugging kba 2025-10-30 12:19:35 +01:00
  • 62d05917c5 test_layout: str(Path) kba 2025-10-30 12:17:38 +01:00
  • 7d3405928b Continue commit 5a0e4c3 using find_number_of_columns_in_document from eynollah v0.5.0. Ignoring deskewing_angle issues. Fixes some reading-order problems but further debugging is needed. fix_reading_order_bug vahidrezanezhad 2025-10-30 10:27:58 +01:00
  • b1e191b2ea reformat cli options table cneud 2025-10-29 22:30:58 +01:00
  • f6c0f56348 Update README.md cneud 2025-10-29 22:23:56 +01:00
  • 46a45f6b0e Create examples.md cneud 2025-10-29 22:23:48 +01:00
  • 15e6ecb95d make models: update URL kba 2025-10-29 21:27:10 +01:00
  • 600ebfeb50 make: fix to use single-archive ZIP kba 2025-10-29 21:07:49 +01:00
  • 9ab565fa02 model basedir might be a symlink kba 2025-10-29 21:02:42 +01:00
  • 4772fd17e2 missed changing override mechanism in eynollah_ocr kba 2025-10-29 20:47:13 +01:00
  • 29c273685f fix merge issues kba 2025-10-29 19:52:28 +01:00
  • de76eabc1d Merge branch 'cli-logging' into model-zoo kba 2025-10-29 19:41:01 +01:00
  • 5e22e9db64 model_zoo: make type str to reduce importing overhead kba 2025-10-29 19:08:32 +01:00
  • a913bdf7dc make --model-basedir and --model-overrides top-level CLI options kba 2025-10-29 18:24:17 +01:00
  • b6f82c72b9 refactor cli tests kba 2025-10-29 16:20:30 +01:00
  • 22d61e8d94 remove newspaper images from main readme cneud 2025-10-28 19:56:23 +01:00
  • 8822da17cf Merge remote-tracking branch 'origin/updating_docs' into docs_and_minor_fixes cneud 2025-10-28 19:53:12 +01:00
  • ef999c8f0a Merge branch 'model-zoo' of lx0145.sbb.spk-berlin.de:/data/eynollah into model-zoo kba 2025-10-27 11:45:20 +01:00
  • 294b6356d3 wip kba 2025-10-27 11:45:16 +01:00
  • 51d2680d9c wip kba 2025-10-27 11:44:59 +01:00
  • cf5a0bacd2 Merge 19b2c3fa42 into 38c028c6b5 Robert Sachunsky 2025-10-25 13:36:48 +02:00
  • 19b2c3fa42 reading order: improve handling of headings and horizontal seps Robert Sachunsky 2025-10-24 22:51:19 +02:00
  • 3367462d18 return_boxes_of_images_by_order_of_reading_new: change arg order Robert Sachunsky 2025-10-24 22:46:46 +02:00
  • a2a9fe5117 delete_separator_around: simplify, eynollah: identifiers Robert Sachunsky 2025-10-24 02:35:04 +02:00
  • 3ebbc2d693 return_boxes_of_images_by_order_of_reading_new: indent Robert Sachunsky 2025-10-24 02:30:39 +02:00
  • 66a0e55e49 return_boxes_of_images_by_order_of_reading_new: avoid oversplits Robert Sachunsky 2025-10-24 02:15:13 +02:00
  • 6fbb5f8a12 return_boxes_of_images_by_order_of_reading_new: simplify Robert Sachunsky 2025-10-24 02:02:39 +02:00
  • 6cc5900943 find_num_col: add better plotting (but commented out) Robert Sachunsky 2025-10-24 01:55:07 +02:00
  • 5d15941b35 contours_in_same_horizon: simplify Robert Sachunsky 2025-10-24 01:51:59 +02:00
  • acee4c1bfe find_number_of_columns_in_document: simplify Robert Sachunsky 2025-10-24 01:43:41 +02:00
  • b2a79cc6ed return_x_start_end_mothers_childs_and_type_of_reading_order: fix+1 Robert Sachunsky 2025-10-24 01:31:52 +02:00
  • e2dfec75fb return_x_start_end_mothers_childs_and_type_of_reading_order: simplify and document Robert Sachunsky 2025-10-24 01:19:20 +02:00
  • 0fc4b2535d return_boxes_of_images_by_order_of_reading_new: fix no-mother case Robert Sachunsky 2025-10-20 16:47:35 +02:00
  • 7c3e418588 return_boxes_of_images_by_order_of_reading_new: simplify Robert Sachunsky 2025-10-20 16:13:51 +02:00
  • cd35241e81 find_number_of_columns_in_document: split headings at top+baseline Robert Sachunsky 2025-10-20 13:41:36 +02:00
  • 6192e5ba5c qualitative evaluation of ocr models are added to docs updating_docs vahidrezanezhad 2025-10-23 16:37:24 +02:00
  • ec1fd93dad wip kba 2025-10-23 11:58:23 +02:00
  • d0ad7a98b7 starting qualitative ocr evaluation vahidrezanezhad 2025-10-22 22:45:22 +02:00
  • 7b7714af2e completing ocr evaluations metric vahidrezanezhad 2025-10-22 22:42:37 +02:00
  • b56bb44284 providing ocr model evaluation metrics vahidrezanezhad 2025-10-22 21:30:06 +02:00
  • 59eb4fd3be images with ro are added to readme vahidrezanezhad 2025-10-22 19:04:01 +02:00
  • ab9ddd5214 OCR examples are added to README vahidrezanezhad 2025-10-22 18:41:15 +02:00
  • 2fc723d292 extend README vahidrezanezhad 2025-10-22 18:29:14 +02:00
  • 874cfc247f . kba 2025-10-22 17:56:18 +02:00
  • 883546a6b8 eynollah models package kba 2025-10-22 16:38:05 +02:00
  • 04bc4a63d0 reorganize model_zoo kba 2025-10-22 16:04:48 +02:00
  • d94285b3ea rewrite model spec data structure kba 2025-10-22 13:07:35 +02:00
  • 146658f026 eynollah layout: fix trocr_processor model_zoo call kba 2025-10-22 10:47:09 +02:00
  • 4c8abfe19c eynollah_ocr: actually replace the model calls kba 2025-10-22 10:40:49 +02:00
  • 1337461d47 adopt image_enhancer to the zoo kba 2025-10-21 19:24:55 +02:00
  • f0c86672f8 adopt mb_ro_on_layout to the zoo kba 2025-10-21 17:55:08 +02:00
  • bcffa2e503 adopt binarizer to the zoo kba 2025-10-21 17:53:24 +02:00
  • de34a15809 Makefile: fix make models for OCR kba 2025-10-21 17:27:16 +02:00
  • 9d2b18d2af test_run: check log messages starting with eynollah kba 2025-10-21 13:29:55 +02:00
  • a53d5fc452 update docs/makefile to point to v0.6.0 models kba 2025-10-21 13:15:57 +02:00
  • c6b863b13f typing and asserts kba 2025-10-21 12:05:27 +02:00
  • 44b75eb36f cli: model -> model_basedir kba 2025-10-21 10:48:48 +02:00
  • 7d70835d22 small fixes to main readme cneud 2025-10-20 23:19:10 +02:00
  • 230e7cc705 integrate ocrd docs cneud 2025-10-20 22:52:54 +02:00
  • e5254dc6c5 integrate training docs cneud 2025-10-20 22:39:54 +02:00
  • 6e3399fe7a combine Docker docs cneud 2025-10-20 22:16:56 +02:00
  • 062f317d2e Introduce model_zoo to Eynollah_ocr kba 2025-10-20 21:14:52 +02:00
  • d609a532bf organize imports mostly kba 2025-10-20 19:46:07 +02:00
  • 48d1198d24 move Eynollah_ocr to separate module kba 2025-10-20 19:15:31 +02:00
  • b90cfdfcc4 adapt tests to -l being top-level option now cli-logging kba 2025-10-20 18:56:24 +02:00
  • a850ef39ea factor model loading in Eynollah to EynollahModelZoo kba 2025-10-20 18:34:44 +02:00
  • 5a0e4c3b0f find_number_of_columns_in_document: improve splitter rule Robert Sachunsky 2025-10-20 13:36:10 +02:00
  • 542d38ab43 find_number_of_columns_in_document: simplify, rename lineseps Robert Sachunsky 2025-10-20 13:34:56 +02:00
  • d3d599b010 order_of_regions: add better plotting (but commented out) Robert Sachunsky 2025-10-20 13:27:23 +02:00
  • c43a825d1d order_of_regions: filter out-of-image peaks Robert Sachunsky 2025-10-20 13:26:01 +02:00
  • 48761c3e12 find_num_col: simplify, add better plotting (but commented out) Robert Sachunsky 2025-10-20 13:20:12 +02:00
  • 184927fb54 find_num_cols: re-sort peaks when cutting n-best num_col_classifier Robert Sachunsky 2025-10-20 13:16:57 +02:00
  • 086c1880ac binarization: add option --overwrite, skip existing outputs Robert Sachunsky 2025-10-15 12:24:21 +02:00
  • c8455370a9 updating heuristics and ocr documentation vahidrezanezhad 2025-10-20 15:13:45 +02:00
  • 3ec5ceb22e Update flowchart vahidrezanezhad 2025-10-20 14:55:14 +02:00
  • 9d2dbb8388 updating model based reading orde detection vahidrezanezhad 2025-10-20 14:47:55 +02:00
  • 496a0e2ca4 readme and documentation updates cneud 2025-10-17 19:19:26 +02:00
  • f212ffa22d remove unnecessary backslash cneud 2025-10-17 18:27:18 +02:00
  • 9733d575bf replace list declaration with list literal (faster) cneud 2025-10-17 18:21:49 +02:00
  • 20a95365c2 remove redundant parentheses cneud 2025-10-17 18:19:00 +02:00
  • 2a1f892d72 expand keywords and supported Python versions cneud 2025-10-17 18:17:41 +02:00