Commit graph

  • 90ee2d61dc textline segmentation is masked with drop capitals vahidrezanezhad 2024-10-28 20:56:06 +01:00
  • 5037e9896d Merge branch 'machine_based_reading_order_integration' of https://github.com/qurator-spk/eynollah into machine_based_reading_order_integration vahidrezanezhad 2024-10-25 19:47:20 +02:00
  • 82281bd6cf fixing a bug occuring with reading order + Slro option with no patch textline model and thresholding artificial class vahidrezanezhad 2024-10-25 19:42:48 +02:00
  • 328d33e3dc Temporary commit – textline prediction without patches vahidrezanezhad 2024-10-23 16:55:41 +02:00
  • 70772d4104 binarization as a standalone command vahidrezanezhad 2024-10-21 23:46:38 +02:00
  • f93fa12441 doing more multiprocessing in order to make the process faster vahidrezanezhad 2024-10-18 09:14:42 +02:00
  • 3ef4eac24c textlines of textregions are extracted in a faster way + early layout for all documents is done with no patches model and on rgb input vahidrezanezhad 2024-10-17 19:12:28 +02:00
  • bc9dddd2c0 Update README.md Clemens Neudecker 2024-10-16 14:21:48 +02:00
  • 21893910b8 relax tf2 requirement to < 2.13 Clemens Neudecker 2024-10-16 14:20:53 +02:00
  • 396ffcd185 Providing additional comments refactoring-2024-08 vahidrezanezhad 2024-10-07 16:08:27 +02:00
  • 1da4b7f589 updating light version vahidrezanezhad 2024-10-07 10:55:10 +02:00
  • 543ed4bc38 -light version need -tll to be enabled otherwise the process will be ended. vahidrezanezhad 2024-10-02 14:09:13 +02:00
  • 51f6ef63f5 Merge pull request #137 from qurator-spk/dockerfile Clemens Neudecker 2024-10-01 17:08:22 +02:00
  • b13759fdcf ci: smoke-test make docker kba 2024-10-01 15:38:39 +02:00
  • c487be2a1d dockerfile: use src-layout kba 2024-10-01 15:38:01 +02:00
  • 7eb1390583 Merge branch 'main' into dockerfile kba 2024-10-01 15:25:56 +02:00
  • 4bed606eb2 just started adding comments to clarify how it works. vahidrezanezhad 2024-09-30 23:48:06 +02:00
  • ab63d5ba40 updating light version features vahidrezanezhad 2024-09-30 21:28:39 +02:00
  • 1774076f4a updating light version. Remove textlines or textregion contours inside a bigger one vahidrezanezhad 2024-09-30 16:10:29 +02:00
  • ad32316217 updating light version vahidrezanezhad 2024-09-27 20:59:01 +02:00
  • 133091137d dilation of textregions and marginals are accomplished vahidrezanezhad 2024-09-27 13:57:01 +02:00
  • 95effe54a0 updating textregions dilation vahidrezanezhad 2024-09-25 20:00:53 +02:00
  • b33739adee parametriyation in the case of textline contours dilation is accomplished vahidrezanezhad 2024-09-24 16:06:27 +02:00
  • 6626dc6866 updating textline dilation parameters vahidrezanezhad 2024-09-23 15:50:37 +02:00
  • 62f8ae4860 updating dilation of textlines and text regions vahidrezanezhad 2024-09-23 14:03:07 +02:00
  • 7f08458436 dilation of text regions without opencv vahidrezanezhad 2024-09-21 14:39:54 +02:00
  • 5d680136a4 updating light version vahidrezanezhad 2024-09-21 01:04:28 +02:00
  • b9e8959c4a update of light versions vahidrezanezhad 2024-09-20 16:33:13 +02:00
  • 2d18739d9b postprocessing of textline contour dilation + skip layout and reading order passed as an argument vahidrezanezhad 2024-09-20 15:08:09 +02:00
  • 4af0bc079c Merge pull request #132 from qurator-spk/extracting_images_only Clemens Neudecker 2024-09-20 09:35:40 +02:00
  • 5a07cd9cfa the most effective version of contours dilation without opencv and all at once vahidrezanezhad 2024-09-19 16:21:55 +02:00
  • d168edfd77 Update cli.py to block other processing in the case of extract_image_only michalbubula 2024-09-19 15:20:37 +02:00
  • 723f27bec4 Add -eoi option to README.md michalbubula 2024-09-19 14:41:17 +02:00
  • 74a0699f6b extracting images only now works for a single image input vahidrezanezhad 2024-09-19 11:20:13 +02:00
  • a1f1f98de3 updating scaling contours vahidrezanezhad 2024-09-18 00:08:54 +02:00
  • 327b446a16 update Makefile with v0.3.1 models Clemens Neudecker 2024-09-17 21:39:17 +02:00
  • 351e9a897a update ocrd-tool.json with v0.3.1 models Clemens Neudecker 2024-09-17 21:32:23 +02:00
  • 21380fc870 scaling contours without dilation vahidrezanezhad 2024-09-17 15:06:41 +02:00
  • 478edc804a Add Dockerfile and make docker target kba 2024-09-16 18:21:14 +02:00
  • 1b18ae874b passing number of columns as an argument vahidrezanezhad 2024-09-13 00:52:06 +02:00
  • 2c93904985 avoiding double binarization vahidrezanezhad 2024-09-12 17:35:28 +02:00
  • f0b49073b7 adding option for textline detection in printspace vahidrezanezhad 2024-09-03 23:10:38 +02:00
  • c156a1612e Exclude run_image_extraction_over_ppn_lists.py from merge Clemens Neudecker 2024-09-03 20:03:44 +02:00
  • 6b2e5d110e all tests are passed vahidrezanezhad 2024-09-03 13:55:55 +02:00
  • c3a4a1bba7 resolving issue #110 in a better way vahidrezanezhad 2024-09-03 13:14:10 +02:00
  • b6d3d2bdbf fix indentation cneud 2024-09-02 20:11:42 +02:00
  • de32d86fb6 Merge branch 'refs/heads/main' into extracting_images_only cneud 2024-09-02 19:55:33 +02:00
  • 0f87974b0c writing drop capitals in xml output + and may resolve issue #110 vahidrezanezhad 2024-09-02 16:21:07 +02:00
  • c6e0e058d0 Merge branch 'main' into v3-api kba 2024-09-02 14:53:37 +02:00
  • fdedae2406 require ocrd>=3.0.0b4 kba 2024-09-02 11:47:57 +02:00
  • f9c2d85dd7 Merge branch 'main' into v3-api kba 2024-09-02 11:46:56 +02:00
  • 9b274dcc20 Merge pull request #134 from bertsky/v3-api Konstantin Baierer 2024-09-02 11:46:33 +02:00
  • 17eafc1ccb adapt tool json to v3 Robert Sachunsky 2024-09-01 10:15:31 +02:00
  • 1e902571ea undo customizing metadata_filename (not correct with namespace pkg support in core) Robert Sachunsky 2024-09-01 10:15:11 +02:00
  • dfc4ac2538 setuptools: fix (packages.find.where prevented finding namespace qurator) Robert Sachunsky 2024-08-30 22:46:51 +02:00
  • 256a7c347f Merge pull request #133 from qurator-spk/src-layout Clemens Neudecker 2024-08-29 23:13:37 +02:00
  • 84b844203d switch from qurator namespace to src-layout kba 2024-08-29 17:11:29 +02:00
  • 9367f86483 remove setup.py stub completely kba 2024-08-29 17:06:39 +02:00
  • 93005959e5 inference batch size debugged vahidrezanezhad 2024-08-27 18:13:46 +02:00
  • 62314c453c fully transition to pyproject kba 2024-08-27 15:04:57 +02:00
  • a5c7f223d1 📦 v0.3.1 v0.3.1 kba 2024-08-27 14:54:59 +02:00
  • 9ae0575436 📝 changelog kba 2024-08-27 14:52:01 +02:00
  • 7ae6a8776f ignoring dpi check by light version vahidrezanezhad 2024-08-26 16:02:10 +02:00
  • aef46a4669 require ocrd >= 3.0.0b1 kba 2024-08-26 11:31:13 +02:00
  • 7b92620a10 processor: no more DPI info lost Konstantin Baierer 2024-08-26 10:45:53 +02:00
  • d26079db85 procesor.py: simplify imports further kba 2024-08-26 10:40:15 +02:00
  • ecd202ea4c processor.py: Simplify import Konstantin Baierer 2024-08-26 10:39:22 +02:00
  • d98fa2a85b check_dpi: fix Pillow type detection Robert Sachunsky 2024-05-28 14:07:45 +02:00
  • 61bcb435ae processor: reuse loaded models across pages, use derived images Robert Sachunsky 2023-06-11 22:14:41 +02:00
  • c37d95dedf non-legacy namespace package # Conflicts: # setup.py Robert Sachunsky 2024-05-23 21:19:33 +02:00
  • 49c1a8f384 fix namespace pkg setup Robert Sachunsky 2024-05-24 14:29:57 +00:00
  • 3381e5a015 adapt to OcrdFile.local_filename now :Path # Conflicts: # qurator/eynollah/processor.py Robert Sachunsky 2024-01-24 19:33:49 +01:00
  • 8dfecb70d4 adapt to ocrd>=2.54 url vs local_filename Robert Sachunsky 2024-01-19 16:17:02 +00:00
  • 8ec9fc6da2 Merge remote-tracking branch 'origin/refactor' into refactoring-2024-08 refactoring-2024-08-merged kba 2024-08-24 18:51:44 +02:00
  • d7caeb2b05 ocrd interface: add ignore_page_extraction kba 2024-08-24 18:11:15 +02:00
  • ddcc0198bd ocrd interface: add right_to_left kba 2024-08-24 18:05:21 +02:00
  • 39b16e5978 ocrd interface: add textline_light kba 2024-08-24 18:00:45 +02:00
  • 87adc4b0c6 ocrd interface: add light_mode parameter kba 2024-08-24 16:51:52 +02:00
  • 0d83db7bc4 update processor to the latest change in bertsky/core#14 kba 2024-08-24 16:46:25 +02:00
  • b954a55d26 move self.model_* to EynollaDirs kba 2024-08-24 16:15:21 +02:00
  • 04e79002b3 making light version faster for 1 and 2 columns images vahidrezanezhad 2024-08-24 12:54:19 +02:00
  • 59dbffea59 remove commented out code kba 2024-08-23 21:35:43 +02:00
  • ac2958edb1 separate_lines.separate_lines_vertical_cont: remove unused args kba 2024-08-23 21:25:37 +02:00
  • 9109e88d50 wip typing kba 2024-08-23 21:22:29 +02:00
  • a5b178e1d1 remove dead code (found with vulture) kba 2024-08-23 21:11:48 +02:00
  • 9ee9c4403b introduce self.batch_processing_mode to clarify when data is read from dir_in kba 2024-08-23 21:04:23 +02:00
  • 532ee6fe41 rfct: introduce EynollahDirs to reduce self.dir_* proliferation kba 2024-08-23 20:55:14 +02:00
  • 762a7a058e adapt to one-arg start_new_session_and_model and rename load_model kba 2024-08-23 20:29:44 +02:00
  • 8c4bfa229f rfct: move all tensorflow/keras imports and hacks to utils.tf kba 2024-08-23 20:10:49 +02:00
  • b15b1bdcd5 rfct: remove unused _old method kba 2024-08-23 19:59:12 +02:00
  • 2c9727f9c9 move keras-specific classes to utils.keras, clean up imports kba 2024-08-23 19:53:04 +02:00
  • d7a774ebd2 test_run: require EYNOLLAH_MODELS to be defined in environ kba 2024-08-23 19:52:02 +02:00
  • d6a72709a1 remove unused image_filename_stem kwarg kba 2024-08-23 18:53:28 +02:00
  • 9ce02a569e ocrd-tool: add "allow_enhancement" parameter kba 2024-08-23 18:32:59 +02:00
  • 4a13781ef4 class Eynollah: add typing, consistent interface in CLI and OCR-D CLI kba 2024-08-23 18:32:29 +02:00
  • 0a3f525f0a port processor to core v3 kba 2024-08-23 18:19:28 +02:00
  • 78bfa97c06 Merge pull request #129 from qurator-spk/resolving_issue_106 Clemens Neudecker 2024-08-23 14:10:26 +02:00
  • 84d05bd0ae s,url,local_filename, kba 2024-08-23 14:01:20 +02:00
  • c10a525675 inference with batch size bigger than 1 vahidrezanezhad 2024-08-23 02:18:16 +02:00
  • 7f99526b9d update Makefile model location cneud 2024-08-15 23:59:18 +02:00