Commit Graph

169 Commits (7ded54a8d21b14fff3c4d048a33710910476b834)

Author SHA1 Message
vahid 01bfc3914d extracting page as an option
vahidrezanezhad ae7c424889 Update eynollah.py
vahid cd9920eea7 extracting page
vahid 735abc43f3 option to ignore page extraction
cneud 934bbd5892 cleanup
cneud ecf117ca95 adapt to tf1.compat session mode in tf2
Clemens Neudecker 568391ec4a require model command line option (fix ) ()
vahid 3bbbeecfec all options are enabled for light version
Gerber, Mike f27ac155ae 🧹 Downgrade "Patch size" log message to debug
Fixes gh-55.
vahid adf10942fa issue resolved
vahid 2eacb9a8ec renaming the models
vahid c606391c31 flow from directory
vahid cf5ef8f5ae light version as option
vahidrezanezhad b8a532180a light version integration
vahidrezanezhad 10f1acef29 Merge pull request from mikegerber/fix/enhanced-message
Fix/enhanced message
vahidrezanezhad c30d4d5c30 Merge pull request from mikegerber/feat/better-time-msgs
💄 Improve timing messages (Fixes )
Gerber, Mike 11d9b00510 🧹 Don't produce spurious TextEquiv elements.
eynollah produces spurious - and empty - pcGts TextEquiv elements. This
is a. unnecessary, b. wrong and c. produces a lot of warning messages
in subsequent OCR processing steps because the OCR processor warns
about already existing text.

Fix this by not generating any TextEquiv elements.

Fixes gh-37.
Gerber, Mike 1fe8f92afc 🐛 Clarify message if an image was enhanced
Gerber, Mike 7ccd7663e1 💄 Improve more timing messages
Gerber, Mike cdea0acffe 💄 Improve timing messages (Fixes )
Konstantin Baierer f0ac0bb090 📦 v0.0.11
Konstantin Baierer d75803b11d ocrd-tool: "models" parameter is a directory
Konstantin Baierer e769f625fe 📦 v0.0.10
Konstantin Baierer 09d85bee87 Merge remote-tracking branch 'vahidrezanezhad/main' into main
vahidrezanezhad 169b50aaaf fixed: empty page error due to None table contours
Konstantin Baierer 0e63ebcbe5 📦 v0.0.9
Konstantin Baierer 4223fed628 Merge remote-tracking branch 'vahidrezanezhad/main' into main
Konstantin Baierer e7868b9851 📦 v0.0.8
Konstantin Baierer 5124a60527 set pcGtsId before adding file to mets
vahid 0859d22f4c modifications
vahid 14c588e162 resolving an issue
vahid 254abf4d3d more modifications for tables
vahid b3b49272a5 README is updated
vahid c67e155431 table detection completed, enhanced images can be now written to output
vahid a5c940705a tables are integrated
vahid 80b17af40c fixed
Konstantin Baierer d784202ae1 📦 v0.0.7
Konstantin Baierer 6b810eb682 Merge remote-tracking branch 'vahidrezanezhad/main' into main
vahid 4560738427 fixed
Konstantin Baierer efc146feb8 📦 v0.0.6
vahid becb0c1329 trivial
vahid 059905c9e4 empty textlines caused by newer python-opencv, is resolved
vahid d1330ffb80 resolved
Konstantin Baierer 80795c9e6c 📦 v0.0.5
Konstantin Baierer 45939abdff OCR-D CLI: remove allow_enhancement parameter
It does not toggle enhancement (eynollah does that internally anyway)
but setting it to true will base the coordinate calculations on that
enhanced (different-sized) image instead of the original. That is never
sensible in the OCR-D context.
Konstantin Baierer 5d2fe79822 📦 v0.0.4
vahid 43c9302390 fixed and separators are also written in xml
Konstantin Baierer fce7cdfd8b 📦 v0.0.3
vahid aa2e91641a Merge branch 'main' of https://github.com/qurator-spk/eynollah into main
vahid 799a7c7632 fixed
Konstantin Baierer 26283c6a3b 📦 v0.0.2
vahid c4b2c71e68 resolving issue https://github.com/qurator-spk/eynollah/issues/38
vahid 7cbecadccc adding the binarization model and option to binarize input document for the cases like dark, strongly bright and other ones
vahid 44dad6a072 strong erosion, more modification
vahidrezanezhad 176c7531ab Update eynollah.py
vahid c051e22432 fixing again the error raised because of erosion
vahidrezanezhad d5be8aece3 Merge pull request from qurator-spk/ocrd-cli
Ocrd cli
Konstantin Baierer 6c8852eb04 check_dpi: catch Pillow choking on faulty img, return 230
Konstantin Baierer ff265eee5c cv2pil: do COLOR_BGR2RGB conversion
Konstantin Baierer c7f304dcb6 ocrd processor: pass local filename as image_filename, ht @bertsky
Konstantin Baierer d0b0e23ac6 do DPI calculation as part of caching images
Konstantin Baierer ae0b4a825a ocrd cli: catch dpi == 1, return 230
Konstantin Baierer 2e8a3e3bee use Page.imageFilename directly for accurate DPI estimate
Konstantin Baierer 42ccb4711d Update qurator/eynollah/ocrd-tool.json
Co-authored-by: Robert Sachunsky <38561704+bertsky@users.noreply.github.com>
vahid 1184d3d2fc issue raised by Clemens, strong erosion causing
Konstantin Baierer 4897cefdb7 allow passing PIL image to Eynollah w/o disk I/O
Konstantin Baierer d40c453dad check_dpi: raise exception if resolution == 1 to trigger except clause
Co-authored-by: Robert Sachunsky <38561704+bertsky@users.noreply.github.com>
Konstantin Baierer 1367f82605 improve ocrd-tool descriptions
Co-authored-by: Robert Sachunsky <38561704+bertsky@users.noreply.github.com>
vahidrezanezhad 037210b292 update writer.py
Konstantin Baierer b8d818ede1 writer: don't create empty PcGts at init
Konstantin Baierer 8c4e9b6068 allow passing pcgts to eynollah and writer
Konstantin Baierer 2bc34891a5 fix CLI call
vahid 98f9272c4b a trivial issue is resolved
Konstantin Baierer 9db6edf51e OCR-D CLI
Konstantin Baierer 1715f0d8b3 allow overriding DPI
Konstantin Baierer c95529725a 🐛 typo type{,_}
Konstantin Baierer 93f93444ee 🐛 typo {c,C}oords
Konstantin Baierer 416a84e542 replace lxml with OCR-D/core PAGE API
vahidrezanezhad 7a859ffae4 Merge branch 'main' into xml-rfct
vahidrezanezhad d5a9817390 back on track - freezing problem, memory error and issues with reading order by drop capitals and marginals are resolved
vahidrezanezhad 43b8759acf back on track - freezing problem, memory error and issues with reading order by drop capitals and marginals are resolved
vahid b473c85a59 OOM error that happened with tensorflow-gpu=1.15.5 is resolved
Konstantin Baierer 3d9da4feaa writer: use a single counter for all regions/lines
Konstantin Baierer a678bbf966 counter: add reset();
Konstantin Baierer a3465ca1a0 eliminate id_of_texts from xml_reading_order, fix plus one error
Konstantin Baierer 6c60d9e90a reading order: fix @index
Konstantin Baierer 02aa31cc66 Merge remote-tracking branch 'origin/main' into xml-rfct
Konstantin Baierer c5736e9b74 fix region counting
vahid 67a9fc8820 ..
vahidrezanezhad 4b3c8a6707 bug in reading order is fixed
vahidrezanezhad 73b7c780ab Update eynollah.py
reading order bug for documents with text regions less than 5: fixed
Konstantin Baierer 03d75f5788 simplify serialize_lines_in_region
Konstantin Baierer d95fcf14c0 id_of_marginalia still necessary
Konstantin Baierer 56b688befe counter: allow arbitrary line/region id
Konstantin Baierer 7eb973b3aa xml_reading_order takes id_of_marginals directly
Konstantin Baierer 98568402c7 counter: init-overrideable
Konstantin Baierer 9b1da7c023 use counter for lines too
Konstantin Baierer 1cd3ee1a2e simplify calculate_polygon_coords
Konstantin Baierer 20fcac6232 remove unnecessary if
Konstantin Baierer 24da879844 add EynollahIdCounter class