eynollah

mirror of https://github.com/qurator-spk/eynollah.git synced 2026-06-02 19:19:16 +02:00

Author	SHA1	Message	Date
Robert Sachunsky	90f1d7aa47	rm summary msg (info already logged elsewhere)	2025-09-24 12:10:11 +02:00
Robert Sachunsky	7933b103f5	log modes only once (in run, not in run_single)	2025-09-24 12:09:30 +02:00
Robert Sachunsky	d0817f5744	fix typo	2025-09-24 12:08:50 +02:00
kba	9ead58b99a	Merge remote-tracking branch 'michalbubula/add-feedback' into prepare-release-v0.5.0	2025-09-23 19:50:27 +02:00
kba	7bde99e866	Merge remote-tracking branch 'origin/updating_readme_for_eynollah_use_cases' into prepare-release-v0.5.0	2025-09-23 19:42:55 +02:00
kba	df8d93dbfa	Merge branch 'main' into add-feedback	2025-09-23 19:20:20 +02:00
vahidrezanezhad	a65405bead	tables are visualized within layout	2025-09-22 15:56:14 +02:00
vahidrezanezhad	554f3988c9	default cnn-rnn and transformer ocr models have changed to model_eynollah_ocr_cnnrnn_20250904 and model_eynollah_ocr_trocr_20250919 respectively	2025-09-21 16:33:14 +02:00
vahidrezanezhad	6bbdfe1074	extending image types	2025-09-21 02:32:40 +02:00
vahidrezanezhad	e97e3ab192	Merge text of textlines and handle hyphenated words by joining them correctly	2025-09-19 23:23:30 +02:00
vahidrezanezhad	b38331b4ab	writing page contour correctly in xml output + ignore unsupported file types when loading images	2025-09-19 18:06:18 +02:00
vahidrezanezhad	994bc8a1c0	debug new page extraction in the case of ignoring page extraction	2025-09-19 15:24:34 +02:00
vahidrezanezhad	530897c6c2	renaming argument names	2025-09-19 13:20:26 +02:00
kba	5c9cf8472b	remove redundant/brittle interval logging	2025-09-18 13:19:57 +02:00
kba	146102842a	convert all print stmts to logger.info calls	2025-09-18 13:15:18 +02:00
kba	c64d102613	move logging to CLI and make initialization optional	2025-09-18 13:07:41 +02:00
vahidrezanezhad	310679eeb8	page extraction model name is changed	2025-09-16 14:27:15 +02:00
vahidrezanezhad	542646791d	For TrOCR, the cropped text lines will no longer be added to a list before prediction. Instead, for each batch size, the text line images will be collected and predictions will be made directly on them.	2025-09-23 19:03:13 +02:00
vahidrezanezhad	68a71be8bc	Running inference on files in a directory	2025-09-13 22:40:11 +02:00
vahidrezanezhad	0711166524	changed the drop capitals bounding box to contour ratio threshold	2025-09-01 11:37:22 +02:00
vahidrezanezhad	e15640aa8a	new page extraction model integration	2025-09-15 13:36:58 +02:00
vahidrezanezhad	6a735daa60	Update README.md	2025-08-31 23:30:54 +02:00
vahidrezanezhad	9b9d21d8ac	eynollah ocr: support using either a specific model name or a models directory (default model)	2025-08-28 11:30:59 +02:00
vahidrezanezhad	41365645ef	Marginals are divided into left and right, and written from top to bottom.	2025-08-26 22:38:03 +02:00
vahidrezanezhad	7741502876	reading order on given layout	2025-08-18 02:31:13 +02:00
Clemens Neudecker	a2359ea4c4	Merge pull request #171 from bertsky/ocrd-machine-based-ro OCR-D processor: expose reading_order_machine_based	2025-08-15 18:40:13 +02:00
Robert Sachunsky	21615a986d	OCR-D processor: expose reading_order_machine_based	2025-08-13 14:14:37 +02:00
michalbubula	8ebba5ac04	add feedback to command line interface	2025-08-12 16:21:15 +02:00
vahidrezanezhad	268aa141d7	avoiding float in range	2025-08-12 12:50:15 +02:00
vahidrezanezhad	cf4983da54	visualize vertical ocr text vertically	2025-08-08 16:12:55 +02:00
vahidrezanezhad	52d9cc9baf	deskewing with faster multiprocessing	2025-08-08 11:32:02 +02:00
vahidrezanezhad	263da755ef	loading xmls with UTF-8 encoding	2025-08-07 10:32:49 +02:00
vahidrezanezhad	6462ea5b33	adding visualization of ocr text of xml file	2025-08-06 22:33:42 +02:00
vahidrezanezhad	322b04145f	use the latest ocr model with balanced fraktur-antiqua training dataset	2025-08-05 14:22:22 +02:00
vahidrezanezhad	1b95f8f38d	threshold for textline ocr + new ocr model	2025-07-25 13:18:38 +02:00
Clemens Neudecker	2996fc8b30	Merge pull request #166 from qurator-spk/updating_readme_for_eynollah_use_cases-cli Updating readme for eynollah use cases cli	2025-07-24 15:30:57 +02:00
vahidrezanezhad	fd0595f920	Update Makefile	2025-07-24 13:52:38 +02:00
vahidrezanezhad	da141bb42e	resolving tests error	2025-07-23 16:44:17 +02:00
vahidrezanezhad	6b8893b188	Merge pull request #167 from qurator-spk/ocrd-fixes Ocrd fixes	2025-07-22 14:46:25 +02:00
vahidrezanezhad	daa597dbaa	should merged text for the whole page be written in xml?	2025-07-21 14:50:05 +02:00
vahidrezanezhad	673e67a847	update model names	2025-07-21 10:54:20 +02:00
vahidrezanezhad	fee40049cd	ocr model renamed - image text font for ocr result is now using Charis-7.000 font (downloaded from here https://software.sil.org/charis/download/)	2025-07-16 14:00:12 +02:00
vahidrezanezhad	04fead348f	ocr: make sure that image height or width is not zero	2025-07-03 15:24:52 +02:00
vahidrezanezhad	53dd4b26a9	decorated with confidence value for cnnrnn ocr model	2025-07-03 11:50:47 +02:00
vahidrezanezhad	1b222594d6	Update README.md: how to train model using docker image	2025-06-25 18:33:55 +02:00
vahidrezanezhad	f5a1d1a255	docker file to train model with desired cuda and cudnn	2025-06-25 18:24:16 +02:00
kba	b7b218ff11	OCR-D processor: same behavior as standalone wrt light_version/textline_light	2025-06-12 15:30:17 +02:00
vahidrezanezhad	c194a20c9c	Fixed duplicate textline_light assignments (true and false) in the OCR-D framework for the Eynollah light version, which caused rectangles to be used instead of contours for textlines	2025-06-12 15:27:22 +02:00
kba	32889ef1e0	adapt binarization CLI according to #156	2025-06-12 13:57:41 +02:00
vahidrezanezhad	9b4e78c55c	Fixed duplicate textline_light assignments (true and false) in the OCR-D framework for the Eynollah light version, which caused rectangles to be used instead of contours for textlines	2025-06-11 18:57:08 +02:00

1144 commits