Commit Graph

  • 1d5589bfa9 Merge 92bfac4b41 into bc9dddd2c0 vahidrezanezhad 2024-12-20 14:47:28 +0000
  • 92bfac4b41 Provide OCR as an option to process a directory of XML files, incorporating layout and text line coordinates. machine_based_reading_order_integration vahidrezanezhad 2024-12-20 15:47:21 +0100
  • fbeef79d50 adding scatter_nd inference vahidrezanezhad 2024-12-16 01:11:54 +0100
  • 21dc18c682 Merge f93c6c288d into 871d7bfc5a Robert Sachunsky 2024-12-14 01:50:23 +0000
  • f93c6c288d function of patch-wise inference with scatter_nd is added vahidrezanezhad 2024-12-14 02:50:17 +0100
  • 0e8c561618 debugging issues vahidrezanezhad 2024-12-14 00:24:29 +0100
  • e9c0d716f6 CI: install optional dependencies, too Robert Sachunsky 2024-12-11 23:48:56 +0000
  • dcaf796283 change polarity of orientation angle (PAGE schema required cw=positive) Robert Sachunsky 2024-12-11 23:07:56 +0000
  • b4b0890294 add option to overwrite output xml, but skip by default if file exists Robert Sachunsky 2024-12-11 18:45:18 +0000
  • b9ca7a6191 log num_cols-dependent resizing Robert Sachunsky 2024-12-11 18:44:54 +0000
  • 9270ea4550 annotate region angles in PAGE Robert Sachunsky 2024-12-11 18:37:20 +0000
  • 3b70b11ea6 avoid deskewing patches if binary-empty Robert Sachunsky 2024-12-11 18:36:20 +0000
  • 7e9ee90e6e switch from (ad-hoc) mp.Pool to (attribute) concurrent.futures.ProcessPoolExecutor Robert Sachunsky 2024-12-11 12:18:29 +0000
  • 68456ea002 do_work_of_slopes_new*, do_back_rotation_and_get_cnt_back, do_work_of_contours_in_image: use mp.Pool, simplify Robert Sachunsky 2024-12-11 11:30:38 +0000
  • 25e967397d exit early if no text regions found (to avoid segfault) Robert Sachunsky 2024-12-11 11:24:56 +0000
  • 21efea8711 no del on function argument Robert Sachunsky 2024-12-11 18:36:57 +0000
  • 5e0c1da711 simplify Robert Sachunsky 2024-12-11 00:18:58 +0000
  • 54cb15056b do_image_rotation / return_deskew_slop: avoid code duplication, simplify via mp.Pool Robert Sachunsky 2024-12-09 16:37:34 +0000
  • 6fe02df973 do_image_rotation: fix f93fa12 (do return results) Robert Sachunsky 2024-12-09 16:35:31 +0000
  • d68017037c do_prediction: trigger GC to avoid CUDA OOM Robert Sachunsky 2024-12-09 11:27:11 +0000
  • ad748d0039 do_prediction: avoid code duplication Robert Sachunsky 2024-12-09 10:55:41 +0000
  • c3163caefd avoid indentation Robert Sachunsky 2024-12-05 14:28:17 +0000
  • 055463d23a avoid indentation Robert Sachunsky 2024-12-05 09:43:30 +0000
  • aaea2ef463 simplify Robert Sachunsky 2024-12-05 09:40:02 +0000
  • 3d88b207fc run: log instead of print Robert Sachunsky 2024-12-05 09:39:55 +0000
  • a520bd1f77 wrap extremely long lines Robert Sachunsky 2024-12-04 22:49:34 +0000
  • cd4e426977 avoid indentation (skip_layout_and_reading_order) Robert Sachunsky 2024-12-04 22:11:34 +0000
  • 5b82320707 avoid indentation Robert Sachunsky 2024-12-04 22:09:32 +0000
  • 9f12fa241d log-level: only set 'eynollah' logger level Robert Sachunsky 2024-12-04 22:09:15 +0000
  • 14beb46224 simplify loading models w/o dir_in mode Robert Sachunsky 2024-12-04 21:07:26 +0000
  • 329fac23f6 do not reload enhancement model in dir_in mode, simplify Robert Sachunsky 2024-12-04 18:29:49 +0000
  • 3b9a29bc5c simplify dir_in conditionals Robert Sachunsky 2024-12-04 18:19:54 +0000
  • 7ae64f3717 RO model: do not reload when in dir_in mode Robert Sachunsky 2024-12-04 16:18:35 +0000
  • f765e2603b move Torch to optional dependencies (to avoid clash with TF over CuDNN) Robert Sachunsky 2024-12-04 15:57:13 +0000
  • 871d7bfc5a fixed: machine based reading order cause tuple index out of range error if number of textregion is one. vahidrezanezhad 2024-12-04 16:41:00 +0100
  • 6aad006f4c filter textregions without textline vahidrezanezhad 2024-12-02 12:43:57 +0100
  • 1083d1c7fb gha: try to free disk space kba 2024-11-25 19:32:42 +0100
  • 8014a9e416 Update Makefile vahidrezanezhad 2024-11-22 19:47:06 +0100
  • 3000255a24 Update Makefile vahidrezanezhad 2024-11-22 12:40:21 +0100
  • 1746920275 Update Makefile vahidrezanezhad 2024-11-21 12:08:29 +0100
  • b622494f34 new table detection model is integrated vahidrezanezhad 2024-11-21 02:16:22 +0100
  • d9f79c3404 fixing IndexError by reading order detection vahidrezanezhad 2024-11-18 10:15:19 +0100
  • 5fa8ca46a4 updating requirements vahidrezanezhad 2024-11-14 17:35:00 +0100
  • ce5b611296 tests are passed - new models by the way should be uploaded vahidrezanezhad 2024-11-14 17:18:07 +0100
  • f43c49c508 textlines of drop capitals are connected to corresponding textline if possible otherwise they are inserted in corresponding textregion vahidrezanezhad 2024-11-13 11:53:56 +0100
  • 22b0b07a73 drop capital and marginals extraction is updated vahidrezanezhad 2024-11-11 19:01:40 +0100
  • 1ae77e61c8 Update requirements.txt Clemens Neudecker 2024-11-11 14:11:36 +0100
  • 8409de0e58 sbb_binarization is integrated into eynollah works in framework of ocrd - sbb_binarization in ocrd works for individual images by the way as standalone flowing from directory can be used now. For eynollah in ocrd framework I have added -light version as default parameter. vahidrezanezhad 2024-11-10 19:34:43 +0100
  • 0914b5ff8a resolve merge conflict of main branch with machine based reading order branch vahidrezanezhad 2024-11-06 00:34:00 +0100
  • 6aee70d0cd Resolve merge conflict of main and machine based reading order branch vahidrezanezhad 2024-11-06 00:10:25 +0100
  • df96daff92 rm requirements.txt extracting_images_only vahidrezanezhad 2024-11-05 23:17:03 +0100
  • bceeeb56c1 Merge pull request #138 from qurator-spk/extracting_images_only vahidrezanezhad 2024-11-05 22:10:51 +0100
  • f7e5fb917f resolving merge conflict of machine based reading order and extracting only images branches vahidrezanezhad 2024-11-05 22:09:39 +0100
  • 751b0102f7 updating early layout inference for light version vahidrezanezhad 2024-11-05 19:50:18 +0100
  • e796a99c5c updating inference for early layout in the case of documents with number of columns bigger than 2 vahidrezanezhad 2024-10-30 15:02:50 +0100
  • 438df52287 updating vahidrezanezhad 2024-10-30 00:52:09 +0100
  • 90ee2d61dc textline segmentation is masked with drop capitals vahidrezanezhad 2024-10-28 20:56:06 +0100
  • 5037e9896d Merge branch 'machine_based_reading_order_integration' of https://github.com/qurator-spk/eynollah into machine_based_reading_order_integration vahidrezanezhad 2024-10-25 19:47:20 +0200
  • 82281bd6cf fixing a bug occurring with reading order + Slro option with no patch textline model and thresholding artificial class vahidrezanezhad 2024-10-25 19:42:48 +0200
  • 328d33e3dc Temporary commit – textline prediction without patches vahidrezanezhad 2024-10-23 16:55:41 +0200
  • 70772d4104 binarization as a standalone command vahidrezanezhad 2024-10-21 23:46:38 +0200
  • f93fa12441 doing more multiprocessing in order to make the process faster vahidrezanezhad 2024-10-18 09:14:42 +0200
  • 3ef4eac24c textlines of textregions are extracted in a faster way + early layout for all documents is done with no patches model and on rgb input vahidrezanezhad 2024-10-17 19:12:28 +0200
  • bc9dddd2c0 Update README.md main Clemens Neudecker 2024-10-16 14:21:48 +0200
  • 21893910b8 relax tf2 requirement to < 2.13 Clemens Neudecker 2024-10-16 14:20:53 +0200
  • 396ffcd185 Providing additional comments refactoring-2024-08 vahidrezanezhad 2024-10-07 16:08:27 +0200
  • 1da4b7f589 updating light version vahidrezanezhad 2024-10-07 10:55:10 +0200
  • 543ed4bc38 -light version need -tll to be enabled otherwise the process will be ended. vahidrezanezhad 2024-10-02 14:09:13 +0200
  • 51f6ef63f5 Merge pull request #137 from qurator-spk/dockerfile Clemens Neudecker 2024-10-01 17:08:22 +0200
  • b13759fdcf ci: smoke-test make docker dockerfile kba 2024-10-01 15:38:39 +0200
  • c487be2a1d dockerfile: use src-layout kba 2024-10-01 15:38:01 +0200
  • 7eb1390583 Merge branch 'main' into dockerfile kba 2024-10-01 15:25:56 +0200
  • 4bed606eb2 just started adding comments to clarify how it works. vahidrezanezhad 2024-09-30 23:48:06 +0200
  • ab63d5ba40 updating light version features vahidrezanezhad 2024-09-30 21:28:39 +0200
  • 1774076f4a updating light version. Remove textlines or textregion contours inside a bigger one vahidrezanezhad 2024-09-30 16:10:29 +0200
  • ad32316217 updating light version vahidrezanezhad 2024-09-27 20:59:01 +0200
  • 133091137d dilation of textregions and marginals are accomplished vahidrezanezhad 2024-09-27 13:57:01 +0200
  • 95effe54a0 updating textregions dilation vahidrezanezhad 2024-09-25 20:00:53 +0200
  • b33739adee parametrization in the case of textline contours dilation is accomplished vahidrezanezhad 2024-09-24 16:06:27 +0200
  • 6626dc6866 updating textline dilation parameters vahidrezanezhad 2024-09-23 15:50:37 +0200
  • 62f8ae4860 updating dilation of textlines and text regions vahidrezanezhad 2024-09-23 14:03:07 +0200
  • 7f08458436 dilation of text regions without opencv vahidrezanezhad 2024-09-21 14:39:54 +0200
  • b0a7f62ada pep 8 code style pep-8-code-style cneud 2024-09-21 01:51:35 +0200
  • 5d680136a4 updating light version vahidrezanezhad 2024-09-21 01:04:28 +0200
  • 593cf64693 pep 8 code style cneud 2024-09-20 23:39:34 +0200
  • 826d38b865 pep 8 code style cneud 2024-09-20 23:10:02 +0200
  • dbee1a3084 switch from qurator namespace to src-layout cneud 2024-09-20 20:18:39 +0200
  • b9e8959c4a update of light versions vahidrezanezhad 2024-09-20 16:33:13 +0200
  • 2d18739d9b postprocessing of textline contour dilation + skip layout and reading order passed as an argument vahidrezanezhad 2024-09-20 15:08:09 +0200
  • 4af0bc079c Merge pull request #132 from qurator-spk/extracting_images_only Clemens Neudecker 2024-09-20 09:35:40 +0200
  • 5a07cd9cfa the most effective version of contours dilation without opencv and all at once vahidrezanezhad 2024-09-19 16:21:55 +0200
  • d168edfd77 Update cli.py to block other processing in the case of extract_image_only michalbubula 2024-09-19 15:20:37 +0200
  • 723f27bec4 Add -eoi option to README.md michalbubula 2024-09-19 14:41:17 +0200
  • 74a0699f6b extracting images only now works for a single image input vahidrezanezhad 2024-09-19 11:20:13 +0200
  • a1f1f98de3 updating scaling contours vahidrezanezhad 2024-09-18 00:08:54 +0200
  • 327b446a16 update Makefile with v0.3.1 models Clemens Neudecker 2024-09-17 21:39:17 +0200
  • 351e9a897a update `ocrd-tool.json` with v0.3.1 models Clemens Neudecker 2024-09-17 21:32:23 +0200
  • 21380fc870 scaling contours without dilation vahidrezanezhad 2024-09-17 15:06:41 +0200
  • 478edc804a Add Dockerfile and make docker target kba 2024-09-16 18:21:14 +0200
  • 1b18ae874b passing number of columns as an argument vahidrezanezhad 2024-09-13 00:52:06 +0200