eynollah

mirror of https://github.com/qurator-spk/eynollah.git synced 2025-12-18 00:54:14 +01:00

Author	SHA1	Message	Date
vahidrezanezhad	54d9916f3b	page extraction model name is changed	2025-09-16 14:27:15 +02:00
vahidrezanezhad	52cb0d9fac	new page extraction model integration	2025-09-15 13:38:23 +02:00
vahidrezanezhad	6e008345a0	new page extraction model integration	2025-09-15 13:36:58 +02:00
vahidrezanezhad	8c949cec71	PR #173 has been reverted. Additionally, for TrOCR, the cropped text lines will no longer be added to a list before prediction. Instead, for each batch size, the text line images will be collected and predictions will be made directly on them.	2025-09-03 19:18:11 +02:00
vahidrezanezhad	d9ae7bd12c	merged pr #173 in #175	2025-09-02 15:27:19 +02:00
Robert Sachunsky	b84d945b5a	Merge pull request #3 from bertsky/polygon-dilate-buffer-refactor2 some refactoring (second attempt)...	2025-09-02 13:26:52 +02:00
vahidrezanezhad	92a7c7cfea	changed the drop capitals bounding box to contour ratio threshold	2025-09-01 11:37:22 +02:00
Robert Sachunsky	090341241e	writer: use @type='heading' instead of 'header'	2025-08-29 17:21:30 +02:00
Robert Sachunsky	bb9cba1fd9	writer: SeparatorRegion needs SeparatorRegionType (not ImageRegionType)	2025-08-29 17:21:30 +02:00
Robert Sachunsky	eae1303ebb	contours: rename 'pixel' → 'label' for clarity	2025-08-29 17:21:30 +02:00
Robert Sachunsky	dbbf1073df	avoid pulling unused 'image_page_rotated' through functions	2025-08-29 17:21:30 +02:00
Robert Sachunsky	142ac8825e	use box2rect instead of crop_image_inside_box when no image needed	2025-08-29 17:21:30 +02:00
Robert Sachunsky	892ff41e38	utils: introduce box2rect and box2slice	2025-08-29 17:21:30 +02:00
Robert Sachunsky	d3566e55ef	polygon2contour: fix `698f38e4` (deprecated dtype)	2025-08-29 17:21:08 +02:00
Robert Sachunsky	741aa7867c	get_marginals: exit early if no peaks found to avoid spurious overlap mask	2025-08-29 12:46:19 +02:00
Robert Sachunsky	57821662b9	filter_contours_without_textline_inside: avoid removing from identical lists twice	2025-08-29 12:46:12 +02:00
Robert Sachunsky	698f38e461	polygon2contour: avoid overflow	2025-08-29 12:43:46 +02:00
vahidrezanezhad	fdcae8dd6e	eynollah ocr: support using either a specific model name or a models directory (default model)	2025-08-28 11:30:59 +02:00
vahidrezanezhad	7dd281267d	Marginals are divided into left and right, and written from top to bottom.	2025-08-26 22:38:03 +02:00
Robert Sachunsky	fd6a6495a2	increase dilatation: textregions/lines (5→6), seplines (0→1)	2025-08-21 13:00:31 +02:00
Robert Sachunsky	8be52fb143	refactor shapely conversions into contour2polygon / polygon2contour, also handle heterogeneous geometries	2025-08-21 12:59:03 +02:00
Robert Sachunsky	8b5f90e243	move dilate_*_contours to .utils.contour, rename dilate_textregions_contours_textline_version → dilate_textline_contours	2025-08-21 01:42:46 +02:00
Robert Sachunsky	244772f086	filter_contours_area_of_image*: also ensure validity here	2025-08-21 01:33:16 +02:00
Robert Sachunsky	42474afa4b	rename lines_xml → seplines for clarity	2025-08-21 01:32:32 +02:00
Robert Sachunsky	b610fe07a6	check_any_text_region_in_model_one_is_main_or_header_light: return original instead of resampled contours	2025-08-21 01:05:15 +02:00
Robert Sachunsky	3d53070b90	avoid creating invalid polygons via rounding	2025-08-21 01:03:46 +02:00
Robert Sachunsky	277d00579e	get_textregion_contours_in_org_image_light: no back rotation, drop slope_first (always 0)	2025-08-20 14:28:14 +02:00
Robert Sachunsky	b6d1c43a85	dilate_textregions_contours_textline_version: simplify (via shapely's Polygon.buffer()), ensure validity	2025-08-20 14:26:14 +02:00
Robert Sachunsky	6c442c9ae9	separate_lines/do_work_of_slopes: skip if crop is empty	2025-08-19 22:56:36 +02:00
Robert Sachunsky	e9a6ff5d81	return_boxes_of_images_by_order_of_reading_new: simplify, avoid changing dtype during np.append	2025-08-19 20:09:09 +02:00
Robert Sachunsky	f994ea5f0b	dilate_textregions_contours: simplify (via shapely's Polygon.buffer()), ensure validity	2025-08-19 11:59:26 +02:00
vahidrezanezhad	8dc2fab9fa	reading order on given layout	2025-08-18 02:31:13 +02:00
Clemens Neudecker	a2359ea4c4	Merge pull request #171 from bertsky/ocrd-machine-based-ro OCR-D processor: expose reading_order_machine_based	2025-08-15 18:40:13 +02:00
Robert Sachunsky	21615a986d	OCR-D processor: expose reading_order_machine_based	2025-08-13 14:14:37 +02:00
vahidrezanezhad	20614d1678	avoiding float in range	2025-08-12 12:50:15 +02:00
vahidrezanezhad	5db3e9fa64	deskewing with faster multiprocessing	2025-08-08 11:32:02 +02:00
vahidrezanezhad	a0c19c57be	use the latest ocr model with balanced fraktur-antiqua training dataset	2025-08-05 14:22:22 +02:00
vahidrezanezhad	0803881f36	threshold for textline ocr + new ocr model	2025-07-25 13:18:38 +02:00
vahidrezanezhad	6b8893b188	Merge pull request #167 from qurator-spk/ocrd-fixes Ocrd fixes	2025-07-22 14:46:25 +02:00
vahidrezanezhad	d968a306e4	should merged text for the whole page be written in xml?	2025-07-21 14:50:05 +02:00
vahidrezanezhad	920705c3b1	update model names	2025-07-21 10:54:20 +02:00
vahidrezanezhad	e0f4a007e4	ocr model renamed - image text font for ocr result is now using Charis-7.000 font (downloaded from here https://software.sil.org/charis/download/)	2025-07-16 14:00:12 +02:00
vahidrezanezhad	e54ebaa23e	ocr: make sure that image height or width is not zero	2025-07-03 15:24:52 +02:00
vahidrezanezhad	59ea493803	decorated with confidence value for cnnrnn ocr model	2025-07-03 11:50:47 +02:00
kba	b7b218ff11	OCR-D processor: same behavior as standalone wrt light_version/textline_light	2025-06-12 15:30:17 +02:00
vahidrezanezhad	c194a20c9c	Fixed duplicate textline_light assignments (true and false) in the OCR-D framework for the Eynollah light version, which caused rectangles to be used instead of contours for textlines	2025-06-12 15:27:22 +02:00
vahidrezanezhad	065f1f9a93	Fix: Resolved OCR bug when text region type is undefined	2025-06-02 18:21:33 +02:00
vahidrezanezhad	7996afac69	image enhancer updated	2025-06-01 22:44:50 +02:00
vahidrezanezhad	d14bd162ca	saving enhanced image in org or scaled resolution	2025-06-01 22:10:13 +02:00
vahidrezanezhad	cc36694dfd	image enhancer is integrated	2025-06-01 15:53:04 +02:00

813 commits