You cannot select more than 25 topics
Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
201 lines
5.2 KiB
Markdown
201 lines
5.2 KiB
Markdown
Change Log
|
|
==========
|
|
|
|
Versioned according to [Semantic Versioning](http://semver.org/).
|
|
|
|
## Unreleased
|
|
|
|
## [0.4.0] - 2025-04-07
|
|
|
|
Fixed:
|
|
|
|
* allow empty imports for optional dependencies
|
|
* avoid Numpy warnings (empty slices etc)
|
|
* remove deprecated Numpy types
|
|
* binarization CLI: make `dir_in` usable again
|
|
|
|
Added:
|
|
|
|
* Continuous Deployment via Dockerhub and GHCR
|
|
* CI: also test CLIs and OCR-D
|
|
* CI: measure code coverage, annotate+upload reports
|
|
* smoke-test: also check results
|
|
* smoke-test: also test sbb-binarize
|
|
* ocrd-test: analog for OCR-D CLI (segment and binarize)
|
|
* pytest: add asserts, extend coverage, use subtests for various options
|
|
* pytest: also add binarization
|
|
* pytest: add `dir_in` mode (segment and binarize)
|
|
* make install: control optional dependencies via `EXTRAS` variable
|
|
* OCR-D: expose and describe recently added parameters:
|
|
- `ignore_page_extraction`
|
|
- `allow_enhancement`
|
|
- `textline_light`
|
|
- `right_to_left`
|
|
* OCR-D: :fire: integrate ocrd-sbb-binarize
|
|
* add detection confidence in `TextRegion/Coords/@conf`
|
|
(but only in light version and not for marginalia)
|
|
|
|
Changed:
|
|
|
|
* Docker build: simplify, w/ `OCR`, conform to OCR-D spec
|
|
* OCR-D: :fire: migrate to core v3
|
|
- initialize+setup only once
|
|
- restrict number of parallel page workers to 1
|
|
(conflicts with existing multiprocessing; TF parts not mp-compatible)
|
|
- do query maximally annotated page image
|
|
(but filtering existing binarization/cropping/deskewing),
|
|
rebase (as new `@imageFilename`) if necessary
|
|
- add behavioural docstring
|
|
|
|
* :fire: refactor `Eynollah` API:
|
|
- no more data (kw)args at init,
|
|
but kwargs `dir_in` / `image_filename` for `run()`
|
|
- no more data attributes, but function kwargs
|
|
(`pcgts`, `image_filename`, `image_pil`, `dir_in`, `override_dpi`)
|
|
- remove redundant TF session/model loaders
|
|
(only load once during init)
|
|
- factor `run_single()` out of `run()` (loop body),
|
|
expose for independent calls (like OCR-D)
|
|
- expose `cache_images()`, add `dpi` kwarg, set `self._imgs`
|
|
- single-image mode writes PAGE file result
|
|
(just as directory mode does)
|
|
|
|
* CLI: assertions (instead of print+exit) for options checks
|
|
* light mode: fine-tune ratio to better detect a region as header
|
|
|
|
## [0.3.1] - 2024-08-27
|
|
|
|
Fixed:
|
|
|
|
* regression in OCR-D processor, #106
|
|
* Expected Ptrcv::UMat for argument 'contour', #110
|
|
* Memory usage explosion with very narrow images (e.g. book spine), #67
|
|
|
|
## [0.3.0] - 2023-05-13
|
|
|
|
Changed:
|
|
|
|
* Eynollah light integration, #86
|
|
* use PEP420 style qurator namespace, #97
|
|
* set_memory_growth to all GPU devices alike, #100
|
|
|
|
Fixed:
|
|
|
|
* PAGE-XML coordinates can have self-intersections, #20
|
|
* reading order representation (XML order vs index), #22
|
|
* allow cropping separately, #26
|
|
* Order of regions, #51
|
|
* error while running inference, #75
|
|
* Eynollah crashes while processing image, #77
|
|
* ValueError: bad marshal data, #87
|
|
* contour extraction: inhomogeneous shape, #92
|
|
* Confusing model dir variables, #93
|
|
* New release?, #96
|
|
|
|
## [0.2.0] - 2023-03-24
|
|
|
|
Changed:
|
|
|
|
* Convert default model from HDFS to TF SavedModel, #91
|
|
|
|
Added:
|
|
|
|
* parmeter `tables` to toggle table detectino, #91
|
|
* default model described in ocrd-tool.json, #91
|
|
|
|
## [0.1.0] - 2023-03-22
|
|
|
|
Fixed:
|
|
|
|
* Do not produce spurious `TextEquiv`, #68
|
|
* Less spammy logging, #64, #65, #71
|
|
|
|
Changed:
|
|
|
|
* Upgrade to tensorflow 2.4.0, #74
|
|
* Improved README
|
|
* CI: test for python 3.7+, #90
|
|
|
|
## [0.0.11] - 2022-02-02
|
|
|
|
Fixed:
|
|
|
|
* `models` parameter should have `content-type`, #61, OCR-D/core#777
|
|
|
|
## [0.0.10] - 2021-09-27
|
|
|
|
Fixed:
|
|
|
|
* call to `uild_pagexml_no_full_layout` for empty pages, #52
|
|
|
|
## [0.0.9] - 2021-08-16
|
|
|
|
Added:
|
|
|
|
* Table detection, #48
|
|
|
|
Fixed:
|
|
|
|
* Catch exception, #47
|
|
|
|
## [0.0.8] - 2021-07-27
|
|
|
|
Fixed:
|
|
|
|
* `pc:PcGts/@pcGtsId` was not set, #49
|
|
|
|
## [0.0.7] - 2021-07-01
|
|
|
|
Fixed:
|
|
|
|
* `slopes`/`slopes_h` retval/arguments mixed up, #45, #46
|
|
|
|
## [0.0.6] - 2021-06-22
|
|
|
|
Fixed:
|
|
|
|
* Cast arguments to opencv2 to python native types, #43, #44, opencv/opencv#20186
|
|
|
|
## [0.0.5] - 2021-05-19
|
|
|
|
Changed:
|
|
|
|
* Remove `allow_enhancement` parameter, #42
|
|
|
|
## [0.0.4] - 2021-05-18
|
|
|
|
* fix contour bug, #40
|
|
|
|
## [0.0.3] - 2021-05-11
|
|
|
|
* fix NaN bug, #38
|
|
|
|
## [0.0.2] - 2021-05-04
|
|
|
|
Fixed:
|
|
|
|
* prevent negative coordinates for textlines in marginals
|
|
* fix a bug in the contour logic, #38
|
|
* the binarization model is added into the models and now binarization of input can be done at the first stage of eynollah's pipline. This option can be turned on by -ib (-input_binary) argument. This is suggested for very dark or bright documents
|
|
|
|
## [0.0.1] - 2021-04-22
|
|
|
|
Initial release
|
|
|
|
<!-- link-labels -->
|
|
[0.3.1]: ../../compare/v0.3.1...v0.3.0
|
|
[0.3.0]: ../../compare/v0.3.0...v0.2.0
|
|
[0.2.0]: ../../compare/v0.2.0...v0.1.0
|
|
[0.1.0]: ../../compare/v0.1.0...v0.0.11
|
|
[0.0.11]: ../../compare/v0.0.11...v0.0.10
|
|
[0.0.10]: ../../compare/v0.0.10...v0.0.9
|
|
[0.0.9]: ../../compare/v0.0.9...v0.0.8
|
|
[0.0.8]: ../../compare/v0.0.8...v0.0.7
|
|
[0.0.7]: ../../compare/v0.0.7...v0.0.6
|
|
[0.0.6]: ../../compare/v0.0.6...v0.0.5
|
|
[0.0.5]: ../../compare/v0.0.5...v0.0.4
|
|
[0.0.4]: ../../compare/v0.0.4...v0.0.3
|
|
[0.0.3]: ../../compare/v0.0.3...v0.0.2
|
|
[0.0.2]: ../../compare/v0.0.2...v0.0.1
|
|
[0.0.1]: ../../compare/HEAD...v0.0.1
|