Commit Graph

168 Commits (main)

Author SHA1 Message Date
vahidrezanezhad ae7c424889 Update eynollah.py 2 years ago
vahid cd9920eea7 extracting page 2 years ago
vahid 735abc43f3 option to ignore page extraction 2 years ago
cneud 934bbd5892 cleanup 2 years ago
cneud ecf117ca95 adapt to tf1.compat session mode in tf2 2 years ago
Clemens Neudecker 568391ec4a require model command line option (fix #59) (#73) 2 years ago
vahid 3bbbeecfec all options are enabled for light version 2 years ago
Gerber, Mike f27ac155ae 🧹 Downgrade "Patch size" log message to debug
Fixes gh-55.
2 years ago
vahid adf10942fa issue #55 resolved 2 years ago
vahid 2eacb9a8ec renaming the models 2 years ago
vahid c606391c31 flow from directory 2 years ago
vahid cf5ef8f5ae light version as option 2 years ago
vahidrezanezhad b8a532180a light version integration 2 years ago
vahidrezanezhad 10f1acef29 Merge pull request #65 from mikegerber/fix/enhanced-message
Fix/enhanced message
2 years ago
vahidrezanezhad c30d4d5c30 Merge pull request #64 from mikegerber/feat/better-time-msgs
💄 Improve timing messages (Fixes #62)
2 years ago
Gerber, Mike 11d9b00510 🧹 Don't produce spurious TextEquiv elements.
eynollah produces spurious - and empy - pcGts TextEquiv elements. This
is a. unnecessary, b. wrong and c. produces a lot of warning messages
in subsequent OCR processing steps because the OCR processor warns
about already existing text.

Fix this by not generating any TextEquiv elements.

Fixes gh-37.
2 years ago
Gerber, Mike 1fe8f92afc 🐛 Clarify message if an image was enhanced 2 years ago
Gerber, Mike 7ccd7663e1 💄 Improve more timing messages 2 years ago
Gerber, Mike cdea0acffe 💄 Improve timing messages (Fixes #62) 2 years ago
Konstantin Baierer f0ac0bb090 📦 v0.0.11 2 years ago
Konstantin Baierer d75803b11d ocrd-tool: "models" parameter is a directory 2 years ago
Konstantin Baierer e769f625fe 📦 v0.0.10 3 years ago
Konstantin Baierer 09d85bee87 Merge remote-tracking branch 'vahidrezanezhad/main' into main 3 years ago
vahidrezanezhad 169b50aaaf fixed: empty page error due None table contours 3 years ago
Konstantin Baierer 0e63ebcbe5 📦 v0.0.9 3 years ago
Konstantin Baierer 4223fed628 Merge remote-tracking branch 'vahidrezanezhad/main' into main 3 years ago
Konstantin Baierer e7868b9851 📦 v0.0.8 3 years ago
Konstantin Baierer 5124a60527 set pcGtsId before adding file to mets 3 years ago
vahid 0859d22f4c modifications 3 years ago
vahid 14c588e162 resolving an issue 3 years ago
vahid 254abf4d3d more modifications for tables 3 years ago
vahid b3b49272a5 README is updated 3 years ago
vahid c67e155431 table detection completed, enhanced images can be now written to output 3 years ago
vahid a5c940705a tables are integrated 3 years ago
vahid 80b17af40c #47 fixed 3 years ago
Konstantin Baierer d784202ae1 📦 v0.0.7 3 years ago
Konstantin Baierer 6b810eb682 Merge remote-tracking branch 'vahidrezanezhad/main' into main 3 years ago
vahid 4560738427 #45 fixed 3 years ago
Konstantin Baierer efc146feb8 📦 v0.0.6 3 years ago
vahid becb0c1329 trivial 3 years ago
vahid 059905c9e4 #43 empty textlines caused by newer python-opencv, is resolved 3 years ago
vahid d1330ffb80 #43 resolved 3 years ago
Konstantin Baierer 80795c9e6c 📦 v0.0.5 3 years ago
Konstantin Baierer 45939abdff OCR-D CLI: remove allow_enhancement parameter
It does not toggle enhancement (eynollah does that internally anyway)
but setting it to true will base the coordinate calculations on that
enhanced (different-sized) image instead of the original. That is never
sensible in the OCR-D context.
3 years ago
Konstantin Baierer 5d2fe79822 📦 v0.0.4 3 years ago
vahid 43c9302390 fixed #40 and separators are also written in xml 3 years ago
Konstantin Baierer fce7cdfd8b 📦 v0.0.3 3 years ago
vahid aa2e91641a Merge branch 'main' of https://github.com/qurator-spk/eynollah into main 3 years ago
vahid 799a7c7632 fixed #38 3 years ago
Konstantin Baierer 26283c6a3b 📦 v0.0.2 3 years ago
vahid c4b2c71e68 resolving issue https://github.com/qurator-spk/eynollah/issues/38 3 years ago
vahid 7cbecadccc adding the binarization model and option to binarize input document for the cases like dark, stronly bright and other ones 3 years ago
vahid 44dad6a072 strong erosion, more modification 3 years ago
vahidrezanezhad 176c7531ab Update eynollah.py 3 years ago
vahid c051e22432 fixing again the error raised because of erosion 3 years ago
vahidrezanezhad d5be8aece3 Merge pull request #33 from qurator-spk/ocrd-cli
Ocrd cli
3 years ago
Konstantin Baierer 6c8852eb04 check_dpi: catch Pillow choking on faulty img, return 230 3 years ago
Konstantin Baierer ff265eee5c cv2pil: do COLOR_BGR2RGB conversion 3 years ago
Konstantin Baierer c7f304dcb6 ocrd processor: pass local filename as image_filename, ht @bertsky 3 years ago
Konstantin Baierer d0b0e23ac6 do DPI calculation as part of caching images 3 years ago
Konstantin Baierer ae0b4a825a ocrd cli: catch dpi == 1, return 230 3 years ago
Konstantin Baierer 2e8a3e3bee use Page.imageFilename directly for accurate DPI estimate 3 years ago
Konstantin Baierer 42ccb4711d Update qurator/eynollah/ocrd-tool.json
Co-authored-by: Robert Sachunsky <38561704+bertsky@users.noreply.github.com>
3 years ago
vahid 1184d3d2fc issue raised by Clemens, strong erosion causing 3 years ago
Konstantin Baierer 4897cefdb7 allow passing PIL image to Eynollah w/o disk I/O 3 years ago
Konstantin Baierer d40c453dad check_dpi: raise exception if resolution == 1 to trigger except clause
Co-authored-by: Robert Sachunsky <38561704+bertsky@users.noreply.github.com>
3 years ago
Konstantin Baierer 1367f82605 improve ocrd-tool descriptions
Co-authored-by: Robert Sachunsky <38561704+bertsky@users.noreply.github.com>
3 years ago
vahidrezanezhad 037210b292 update writer.py 3 years ago
Konstantin Baierer b8d818ede1 writer: don't create empty PcGts at init 3 years ago
Konstantin Baierer 8c4e9b6068 allow passing pcgts to eynollah and writer 3 years ago
Konstantin Baierer 2bc34891a5 fix CLI call 3 years ago
vahid 98f9272c4b a trivial issue is resolved 3 years ago
Konstantin Baierer 9db6edf51e OCR-D CLI 3 years ago
Konstantin Baierer 1715f0d8b3 allow overriding DPI 3 years ago
Konstantin Baierer c95529725a 🐛 typo type{,_} 3 years ago
Konstantin Baierer 93f93444ee 🐛 typo {c,C}oords 3 years ago
Konstantin Baierer 416a84e542 replace lxml with OCR-D/core PAGE API 3 years ago
vahidrezanezhad 7a859ffae4 Merge branch 'main' into xml-rfct 3 years ago
vahidrezanezhad d5a9817390 back on track- freezing problem , memory error and issues with reading order by drop capitals and marginals are resolved 3 years ago
vahidrezanezhad 43b8759acf back on track- freezing problem , memory error and issues with reading order by drop capitals and marginals are resolved 3 years ago
vahid b473c85a59 OOM error happend with tensorflow-gpu=1.15.5 is resolved 3 years ago
Konstantin Baierer 3d9da4feaa writer: use a single counter for all regions/lines 3 years ago
Konstantin Baierer a678bbf966 counter: add reset(); 3 years ago
Konstantin Baierer a3465ca1a0 eliminate id_of_texts from xml_reading_order, fix plus one error 3 years ago
Konstantin Baierer 6c60d9e90a reading order: fix @index 3 years ago
Konstantin Baierer 02aa31cc66 Merge remote-tracking branch 'origin/main' into xml-rfct 3 years ago
Konstantin Baierer c5736e9b74 fix region counting 3 years ago
vahid 67a9fc8820 .. 3 years ago
vahidrezanezhad 4b3c8a6707 bug in reading order is fixed 3 years ago
vahidrezanezhad 73b7c780ab Update eynollah.py
reading order bug for documents with text regions less than 5: fixed
3 years ago
Konstantin Baierer 03d75f5788 simplify serialize_lines_in_region 3 years ago
Konstantin Baierer d95fcf14c0 id_of_marginalia still necessary 3 years ago
Konstantin Baierer 56b688befe counter: allow arbitrary line/region id 3 years ago
Konstantin Baierer 7eb973b3aa xml_reading_order takes id_of_marginals directly 3 years ago
Konstantin Baierer 98568402c7 counter: init-overrideable 3 years ago
Konstantin Baierer 9b1da7c023 use counter for lines too 3 years ago
Konstantin Baierer 1cd3ee1a2e simplify calculate_polygon_coords 3 years ago
Konstantin Baierer 20fcac6232 remove unnecessary if 3 years ago
Konstantin Baierer 24da879844 add EynollahIdCounter class 3 years ago
Konstantin Baierer 9f5e4af5f0 factor out marginalia ID calc from xml_reading_order 3 years ago