eynollah

mirror of https://github.com/qurator-spk/eynollah.git synced 2025-09-01 13:29:58 +02:00

Author	SHA1	Message	Date
vahidrezanezhad	87938fe42b	Merge `fdcae8dd6e` into `a2359ea4c4`	2025-08-28 11:37:36 +02:00
vahidrezanezhad	fdcae8dd6e	eynollah ocr: support using either a specific model name or a models directory (default model)	2025-08-28 11:30:59 +02:00
vahidrezanezhad	7dd281267d	Marginals are divided into left and right, and written from top to bottom.	2025-08-26 22:38:03 +02:00
vahidrezanezhad	8dc2fab9fa	reading order on given layout	2025-08-18 02:31:13 +02:00
Clemens Neudecker	a2359ea4c4	Merge pull request #171 from bertsky/ocrd-machine-based-ro OCR-D processor: expose reading_order_machine_based	2025-08-15 18:40:13 +02:00
Robert Sachunsky	21615a986d	OCR-D processor: expose reading_order_machine_based	2025-08-13 14:14:37 +02:00
vahidrezanezhad	20614d1678	avoiding float in range	2025-08-12 12:50:15 +02:00
vahidrezanezhad	5db3e9fa64	deskewing with faster multiprocessing	2025-08-08 11:32:02 +02:00
vahidrezanezhad	a0c19c57be	use the latest ocr model with balanced fraktur-antiqua training dataset	2025-08-05 14:22:22 +02:00
vahidrezanezhad	0803881f36	threshold for textline ocr + new ocr model	2025-07-25 13:18:38 +02:00
vahidrezanezhad	6b8893b188	Merge pull request #167 from qurator-spk/ocrd-fixes Ocrd fixes	2025-07-22 14:46:25 +02:00
vahidrezanezhad	d968a306e4	should merged text for the whole page be written in xml?	2025-07-21 14:50:05 +02:00
vahidrezanezhad	920705c3b1	update model names	2025-07-21 10:54:20 +02:00
vahidrezanezhad	e0f4a007e4	ocr model renamed - image text font for ocr result is now using Charis-7.000 font (downloaded from here https://software.sil.org/charis/download/)	2025-07-16 14:00:12 +02:00
vahidrezanezhad	e54ebaa23e	ocr: make sure that image height or width is not zero	2025-07-03 15:24:52 +02:00
vahidrezanezhad	59ea493803	decorated with confidence value for cnnrnn ocr model	2025-07-03 11:50:47 +02:00
kba	b7b218ff11	OCR-D processor: same behavior as standalone wrt light_version/textline_light	2025-06-12 15:30:17 +02:00
vahidrezanezhad	c194a20c9c	Fixed duplicate textline_light assignments (true and false) in the OCR-D framework for the Eynollah light version, which caused rectangles to be used instead of contours for textlines	2025-06-12 15:27:22 +02:00
vahidrezanezhad	065f1f9a93	Fix: Resolved OCR bug when text region type is undefined	2025-06-02 18:21:33 +02:00
vahidrezanezhad	7996afac69	image enhancer updated	2025-06-01 22:44:50 +02:00
vahidrezanezhad	d14bd162ca	saving enhanced image in org or scaled resolution	2025-06-01 22:10:13 +02:00
vahidrezanezhad	cc36694dfd	image enhancer is integrated	2025-06-01 15:53:04 +02:00
vahidrezanezhad	928a548b70	Parametrize OCR for handling curved lines	2025-05-31 01:09:14 +02:00
vahidrezanezhad	48285ce3f5	updating ocr	2025-05-28 01:17:21 +02:00
vahidrezanezhad	b93fc112bf	updating ocr	2025-05-27 23:45:22 +02:00
vahidrezanezhad	0f154c605a	strings alignment function is added + new changes needed for prediction with both bin and rgb inputs is implemented	2025-05-25 21:44:36 +02:00
vahidrezanezhad	097520bfd2	rnn ocr for all layout textregion types	2025-05-25 03:33:54 +02:00
vahidrezanezhad	27c4b0d0e0	Drop capitals are written separately and are not attached to their corresponding text line. The OCR use case also supports single-image input.	2025-05-25 01:12:58 +02:00
vahidrezanezhad	adcf03c7b7	enhancing ocr	2025-05-23 18:06:53 +02:00
vahidrezanezhad	d4f6e10251	commit `21ec4fb` is picked + rnn ocr at the same time with segmentation + enhancement of mb reading order	2025-05-23 15:55:03 +02:00
vahidrezanezhad	a0647eff93	enhancing curved lines OCR	2025-05-21 17:42:44 +02:00
vahidrezanezhad	f94fc9973b	Implement hyphenated textline merging in OCR engine and a bug fixed for curved textline OCR	2025-05-21 14:39:31 +02:00
vahidrezanezhad	c0835665a9	ocr for curved lines	2025-05-20 19:01:52 +02:00
vahidrezanezhad	848156dd9d	mb reading order now can be done faster. Text regions are clustered using dilation, and mb reading order needs to be implemented for fewer regions	2025-05-20 16:51:08 +02:00
vahidrezanezhad	7a34bbb493	enhancing marginal detection for light version	2025-05-18 02:48:05 +02:00
vahidrezanezhad	0819730355	marginals detection enhanced for light version	2025-05-15 15:33:50 +02:00
vahidrezanezhad	adee1dc55c	enhancement for vertical textlines	2025-05-15 00:45:22 +02:00
vahidrezanezhad	a9cdd56e9a	enhance ocr for vertical textlines	2025-05-14 18:34:58 +02:00
vahidrezanezhad	1ccd3fb7cf	Accurately writing text line contours into xml file when the deskewing exceeds 45 degrees and the text line is in light mode	2025-05-13 15:53:05 +02:00
vahidrezanezhad	07f5b52fa7	The initial attempt at reading heavily deskewed or vertically aligned lines.	2025-05-13 14:40:57 +02:00
vahidrezanezhad	02a679a145	I have tried to address the issues #163 and #161 . The changes have also improved marginal detection and enhanced the isolation of headers.	2025-05-12 00:10:18 +02:00
Clemens Neudecker	3dcbb20cac	Merge pull request #159 from bertsky/main update docker	2025-05-06 15:14:06 +02:00
vahidrezanezhad	5d447abcc4	let to add dataset abbreviation to extracted textline images and text	2025-05-03 02:59:16 +02:00
vahidrezanezhad	8c8fa461bb	machine based model name changed to public one	2025-05-02 12:57:26 +02:00
vahidrezanezhad	a4defbb04d	inference batch size for ocr is passed as an argument	2025-05-02 12:53:33 +02:00
vahidrezanezhad	fd375e15d5	adding space between split textline predicted text in the case of trocr	2025-05-02 01:02:32 +02:00
vahidrezanezhad	5c8084a397	displaying detected text on an image is provided for trocr case	2025-05-02 00:30:36 +02:00
Robert Sachunsky	e9179e1d34	docker: use latest core base stage	2025-05-02 00:16:22 +02:00
Robert Sachunsky	f8b4d29a59	docker: prepackage ocrd-all-module-dir.json	2025-05-02 00:16:22 +02:00
vahidrezanezhad	e2da7a6239	Fix model name to return the correct machine-based model name	2025-04-30 16:06:29 +02:00

785 commits