27 Commits (529f2c0e19cd99da9735be6321da06657954d355)

Author SHA1 Message Date
vahid 583cdcee2c new (hybrid CNN+transformer) textline model which can extract textline contours faster 2 years ago
Gerber, Mike 11d9b00510 🧹 Don't produce spurious TextEquiv elements.
eynollah produces spurious - and empty - pcGts TextEquiv elements. This
is a. unnecessary, b. wrong and c. produces a lot of warning messages
in subsequent OCR processing steps because the OCR processor warns
about already existing text.

Fix this by not generating any TextEquiv elements.

Fixes gh-37.
3 years ago
vahid c67e155431 table detection completed, enhanced images can be now written to output 3 years ago
vahid a5c940705a tables are integrated 3 years ago
vahid 4560738427 #45 fixed 3 years ago
vahid 43c9302390 fixed #40 and separators are also written in xml 3 years ago
vahid 7cbecadccc adding the binarization model and option to binarize input document for the cases like dark, strongly bright and other ones 3 years ago
Konstantin Baierer 4897cefdb7 allow passing PIL image to Eynollah w/o disk I/O 4 years ago
vahidrezanezhad 037210b292 update writer.py 4 years ago
Konstantin Baierer b8d818ede1 writer: don't create empty PcGts at init 4 years ago
Konstantin Baierer 8c4e9b6068 allow passing pcgts to eynollah and writer 4 years ago
Konstantin Baierer c95529725a 🐛 typo type{,_} 4 years ago
Konstantin Baierer 93f93444ee 🐛 typo {c,C}oords 4 years ago
Konstantin Baierer 416a84e542 replace lxml with OCR-D/core PAGE API 4 years ago
vahidrezanezhad d5a9817390 back on track - freezing problem, memory error and issues with reading order by drop capitals and marginals are resolved 4 years ago
Konstantin Baierer 3d9da4feaa writer: use a single counter for all regions/lines 4 years ago
Konstantin Baierer a3465ca1a0 eliminate id_of_texts from xml_reading_order, fix plus one error 4 years ago
Konstantin Baierer c5736e9b74 fix region counting 4 years ago
Konstantin Baierer 03d75f5788 simplify serialize_lines_in_region 4 years ago
Konstantin Baierer d95fcf14c0 id_of_marginalia still necessary 4 years ago
Konstantin Baierer 7eb973b3aa xml_reading_order takes id_of_marginals directly 4 years ago
Konstantin Baierer 9b1da7c023 use counter for lines too 4 years ago
Konstantin Baierer 1cd3ee1a2e simplify calculate_polygon_coords 4 years ago
Konstantin Baierer 20fcac6232 remove unnecessary if 4 years ago
Konstantin Baierer 24da879844 add EynollahIdCounter class 4 years ago
Konstantin Baierer 9f5e4af5f0 factor out marginalia ID calc from xml_reading_order 4 years ago
Konstantin Baierer 58c4403e13 rename package to qurator.eynollah 4 years ago