eynollah/CHANGELOG.md
Robert Sachunsky 3aa7ad04fa 📝 update changelog
2025-09-30 23:14:52 +02:00

Change Log

Versioned according to Semantic Versioning.

Unreleased

Fixed:

  • 🔥 polygons: avoid invalid paths (use Polygon.buffer() instead of dilation etc.)
  • return_boxes_of_images_by_order_of_reading_new: avoid Numpy.dtype mismatch, simplify
  • return_boxes_of_images_by_order_of_reading_new: log any exceptions instead of ignoring
  • filter_contours_without_textline_inside: avoid removing from duplicate lists twice
  • get_marginals: exit early if no peaks found to avoid spurious overlap mask
  • get_smallest_skew: after shifting search range of rotation angle, use overall best result
  • Dockerfile: fix CUDA installation (cuDNN contested between Torch and TF due to extra OCR)
  • OCR: reinstate missing methods and fix utils_ocr function calls
  • 🔥 writer: SeparatorRegion needs SeparatorRegionType (not ImageRegionType) f458e3e
  • tests: switch from pytest-subtests to parametrize so we can use pytest-isolate (so CUDA memory gets freed between tests if running on GPU)
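
To illustrate the get_smallest_skew fix listed above, here is a minimal sketch of keeping the overall best result after the search range has been shifted, rather than trusting only the last range searched. The function names, the toy score function and the angle ranges are assumptions made for illustration; they are not taken from the Eynollah code.

```python
from typing import Callable, Iterable, Tuple


def best_angle(score: Callable[[float], float],
               angles: Iterable[float]) -> Tuple[float, float]:
    """Return the (angle, score) pair with the highest score among the candidates."""
    return max(((angle, score(angle)) for angle in angles),
               key=lambda pair: pair[1])


def search_skew(score: Callable[[float], float],
                first_range: Iterable[float],
                shifted_range: Iterable[float]) -> Tuple[float, float]:
    """Search both ranges and keep the best result overall.

    Returning only the winner of the shifted range would discard a
    better candidate that was already found in the first range.
    """
    candidates = [best_angle(score, first_range),
                  best_angle(score, shifted_range)]
    return max(candidates, key=lambda pair: pair[1])


def toy_score(angle: float) -> float:
    """Hypothetical score that peaks at 3 degrees (stand-in for a real measure)."""
    return -(angle - 3.0) ** 2


first = [-2.0, 0.0, 2.0]
shifted = [10.0, 12.0, 14.0]
angle, value = search_skew(toy_score, first, shifted)
print(angle, value)
```

In this toy setup the shifted range contains only worse candidates, so the result comes from the first range.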

Changed:

  • polygons: slightly widen for regions and lines, increase for separators
  • various refactorings, some code style and identifier improvements
  • deskewing/multiprocessing: switch back to ProcessPoolExecutor (faster), using shared memory if necessary; switch back from loky to stdlib; shut down in __del__() instead of atexit
  • 🔥 OCR: switch CNN-RNN model to 20250930 version compatible with TF 2.12 on CPU, too
  • 🔥 writer: use @type='heading' instead of 'header' for headings
  • CI: update+improve model caching

0.5.0 - 2025-09-26

Fixed:

  • restoring the contour in the original image caused an error due to an empty tuple, #154

Added:

  • eynollah machine-based-reading-order CLI to run reading order detection, #175
  • eynollah enhancement CLI to run image enhancement, #175
  • Improved models for page extraction and reading order detection, #175

0.4.0 - 2025-04-07

Fixed:

  • allow empty imports for optional dependencies
  • avoid Numpy warnings (empty slices etc)
  • remove deprecated Numpy types
  • binarization CLI: make dir_in usable again
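
One common way to tolerate missing optional dependencies is to guard the import and fail only when the feature is actually requested. The sketch below shows that general pattern under assumed names (optional_import, require and the placeholder package name are hypothetical); it is not a copy of how Eynollah handles its extras.

```python
import importlib
from types import ModuleType
from typing import Optional


def optional_import(name: str) -> Optional[ModuleType]:
    """Import a module if it is installed, otherwise return None."""
    try:
        return importlib.import_module(name)
    except ImportError:
        # ModuleNotFoundError is a subclass of ImportError
        return None


def require(module: Optional[ModuleType], feature: str) -> ModuleType:
    """Raise a readable error only when the optional feature is used."""
    if module is None:
        raise RuntimeError(f"{feature} needs an optional dependency that is not installed")
    return module


# "json" is always available; the second name is a deliberately missing placeholder
json_mod = optional_import("json")
missing = optional_import("surely_not_installed_pkg_xyz")

print(json_mod is not None, missing is None)
```

Deferring the error to the point of use keeps the core package importable when only some extras are installed.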

Added:

  • Continuous Deployment via Dockerhub and GHCR
  • CI: also test CLIs and OCR-D
  • CI: measure code coverage, annotate+upload reports
  • smoke-test: also check results
  • smoke-test: also test sbb-binarize
  • ocrd-test: analog for OCR-D CLI (segment and binarize)
  • pytest: add asserts, extend coverage, use subtests for various options
  • pytest: also add binarization
  • pytest: add dir_in mode (segment and binarize)
  • make install: control optional dependencies via EXTRAS variable
  • OCR-D: expose and describe recently added parameters:
    • ignore_page_extraction
    • allow_enhancement
    • textline_light
    • right_to_left
  • OCR-D: 🔥 integrate ocrd-sbb-binarize
  • add detection confidence in TextRegion/Coords/@conf (but only in light version and not for marginalia)
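
As an illustration of the TextRegion/Coords/@conf entry above, the following sketch builds a region whose Coords element carries a confidence attribute, using only the standard library. The helper name, the region id, the polygon and the confidence value are made up for the example, and the 2019-07-15 PAGE namespace is an assumption about the schema version in use.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional, Tuple

# Assumed PAGE namespace (2019-07-15 schema version)
PAGE_NS = "http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15"


def make_text_region(region_id: str,
                     points: List[Tuple[int, int]],
                     conf: Optional[float] = None) -> ET.Element:
    """Build a TextRegion with a Coords child; add @conf only if a value is given."""
    region = ET.Element(f"{{{PAGE_NS}}}TextRegion", {"id": region_id})
    coords = ET.SubElement(region, f"{{{PAGE_NS}}}Coords")
    coords.set("points", " ".join(f"{x},{y}" for x, y in points))
    if conf is not None:
        coords.set("conf", f"{conf:.2f}")
    return region


region = make_text_region("r1", [(0, 0), (100, 0), (100, 50), (0, 50)], conf=0.87)
coords = region.find(f"{{{PAGE_NS}}}Coords")
print(ET.tostring(region, encoding="unicode"))
```

Consumers that do not need the confidence can ignore the attribute, since it sits next to the unchanged points attribute.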

Changed:

  • Docker build: simplify, w/ OCR, conform to OCR-D spec

  • OCR-D: 🔥 migrate to core v3

    • initialize+setup only once
    • restrict number of parallel page workers to 1 (conflicts with existing multiprocessing; TF parts not mp-compatible)
    • query the maximally annotated page image (but filter out existing binarization/cropping/deskewing), and rebase (as new @imageFilename) if necessary
    • add behavioural docstring
  • 🔥 refactor Eynollah API:

    • no more data (kw)args at init, but kwargs dir_in / image_filename for run()
    • no more data attributes, but function kwargs (pcgts, image_filename, image_pil, dir_in, override_dpi)
    • remove redundant TF session/model loaders (only load once during init)
    • factor run_single() out of run() (loop body), expose for independent calls (like OCR-D)
    • expose cache_images(), add dpi kwarg, set self._imgs
    • single-image mode writes PAGE file result (just as directory mode does)
  • CLI: assertions (instead of print+exit) for options checks

  • light mode: fine-tune ratio to better detect a region as header
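
The calling convention described under "refactor Eynollah API" above can be sketched roughly as follows: configuration at init, input data passed as keyword arguments to run(), and the loop body exposed as run_single(). The class below is a hypothetical stand-in with made-up names and a placeholder body; it does not reproduce the real Eynollah class or its signatures.

```python
import os
from typing import List, Optional


class LayoutRunner:
    """Hypothetical stand-in for the refactored API, not the real class."""

    def __init__(self, model_dir: str):
        # Only configuration here; expensive resources would be loaded once at init
        self.model_dir = model_dir
        self.processed: List[str] = []

    def run(self,
            image_filename: Optional[str] = None,
            dir_in: Optional[str] = None) -> None:
        """Process either a single image or every file in a directory."""
        if (image_filename is None) == (dir_in is None):
            raise ValueError("pass exactly one of image_filename or dir_in")
        if dir_in is not None:
            filenames = sorted(os.path.join(dir_in, name)
                               for name in os.listdir(dir_in))
        else:
            filenames = [image_filename]
        for filename in filenames:
            self.run_single(filename)

    def run_single(self, image_filename: str) -> None:
        """Loop body of run(), callable on its own (placeholder processing)."""
        self.processed.append(image_filename)


runner = LayoutRunner(model_dir="models")
runner.run(image_filename="page1.png")   # single-image mode
runner.run_single("page2.png")           # independent call, as a wrapper might do
print(runner.processed)
```

Keeping input data out of the constructor lets one instance be reused for many images without reloading anything.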

0.3.1 - 2024-08-27

Fixed:

  • regression in OCR-D processor, #106
  • Expected Ptr<cv::UMat> for argument 'contour', #110
  • Memory usage explosion with very narrow images (e.g. book spine), #67

0.3.0 - 2023-05-13

Changed:

  • Eynollah light integration, #86
  • use PEP420 style qurator namespace, #97
  • set_memory_growth to all GPU devices alike, #100
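
For readers unfamiliar with the set_memory_growth entry above, this is a minimal sketch of applying the setting to every visible GPU with the TensorFlow 2 configuration API. The wrapper function name is an assumption, and the import guard is only there so that the snippet also runs where TensorFlow is not installed; the actual code in Eynollah may look different.

```python
def enable_memory_growth() -> int:
    """Enable on-demand memory allocation on all GPUs; return how many were found."""
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow not installed: nothing to configure
        return 0
    gpus = tf.config.list_physical_devices("GPU")
    for gpu in gpus:
        # Must be called before the GPU has been initialized by TensorFlow
        tf.config.experimental.set_memory_growth(gpu, True)
    return len(gpus)


num_gpus = enable_memory_growth()
print(f"memory growth enabled on {num_gpus} GPU(s)")
```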

Fixed:

  • PAGE-XML coordinates can have self-intersections, #20
  • reading order representation (XML order vs index), #22
  • allow cropping separately, #26
  • Order of regions, #51
  • error while running inference, #75
  • Eynollah crashes while processing image, #77
  • ValueError: bad marshal data, #87
  • contour extraction: inhomogeneous shape, #92
  • Confusing model dir variables, #93
  • New release?, #96

0.2.0 - 2023-03-24

Changed:

  • Convert default model from HDF5 to TF SavedModel, #91

Added:

  • parameter tables to toggle table detection, #91
  • default model described in ocrd-tool.json, #91

0.1.0 - 2023-03-22

Fixed:

  • Do not produce spurious TextEquiv, #68
  • Less spammy logging, #64, #65, #71

Changed:

  • Upgrade to tensorflow 2.4.0, #74
  • Improved README
  • CI: test for python 3.7+, #90

0.0.11 - 2022-02-02

Fixed:

0.0.10 - 2021-09-27

Fixed:

  • call to build_pagexml_no_full_layout for empty pages, #52

0.0.9 - 2021-08-16

Added:

  • Table detection, #48

Fixed:

  • Catch exception, #47

0.0.8 - 2021-07-27

Fixed:

  • pc:PcGts/@pcGtsId was not set, #49

0.0.7 - 2021-07-01

Fixed:

  • slopes/slopes_h retval/arguments mixed up, #45, #46

0.0.6 - 2021-06-22

Fixed:

0.0.5 - 2021-05-19

Changed:

  • Remove allow_enhancement parameter, #42

0.0.4 - 2021-05-18

  • fix contour bug, #40

0.0.3 - 2021-05-11

  • fix NaN bug, #38

0.0.2 - 2021-05-04

Fixed:

  • prevent negative coordinates for textlines in marginals
  • fix a bug in the contour logic, #38
  • the binarization model has been added to the models, so binarization of the input can now be done as the first stage of eynollah's pipeline. This option is turned on with the -ib (-input_binary) argument and is suggested for very dark or very bright documents

0.0.1 - 2021-04-22

Initial release