| Author | Commit | Message | Date |
| --- | --- | --- | --- |
| Robert Sachunsky | 90f1d7aa47 | rm summary msg (info already logged elsewhere) | 2025-09-24 12:10:11 +02:00 |
| Robert Sachunsky | 7933b103f5 | log modes only once (in run, not in run_single) | 2025-09-24 12:09:30 +02:00 |
| Robert Sachunsky | d0817f5744 | fix typo | 2025-09-24 12:08:50 +02:00 |
| kba | 9ead58b99a | Merge remote-tracking branch 'michalbubula/add-feedback' into prepare-release-v0.5.0 | 2025-09-23 19:50:27 +02:00 |
| kba | 7bde99e866 | Merge remote-tracking branch 'origin/updating_readme_for_eynollah_use_cases' into prepare-release-v0.5.0 | 2025-09-23 19:42:55 +02:00 |
| kba | df8d93dbfa | Merge branch 'main' into add-feedback | 2025-09-23 19:20:20 +02:00 |
| vahidrezanezhad | a65405bead | tables are visulaized within layout | 2025-09-22 15:56:14 +02:00 |
| vahidrezanezhad | 554f3988c9 | default cnn-rnn and transformer ocr models have changed to model_eynollah_ocr_cnnrnn_20250904 and model_eynollah_ocr_trocr_20250919 respectively | 2025-09-21 16:33:14 +02:00 |
| vahidrezanezhad | 6bbdfe1074 | extending image types | 2025-09-21 02:32:40 +02:00 |
| vahidrezanezhad | e97e3ab192 | Merge text of textlines and handle hyphenated words by joining them correctly | 2025-09-19 23:23:30 +02:00 |
| vahidrezanezhad | b38331b4ab | writing page contour correctly in xml output + ignore unsupported file types when loading images | 2025-09-19 18:06:18 +02:00 |
| vahidrezanezhad | 994bc8a1c0 | debug new page extraction in the case of ignoring page extraction | 2025-09-19 15:24:34 +02:00 |
| vahidrezanezhad | 530897c6c2 | renaming argument names | 2025-09-19 13:20:26 +02:00 |
| kba | 5c9cf8472b | remove redundant/brittle interval logging | 2025-09-18 13:19:57 +02:00 |
| kba | 146102842a | convert all print stmts to logger.info calls | 2025-09-18 13:15:18 +02:00 |
| kba | c64d102613 | move logging to CLI and make initialization optional | 2025-09-18 13:07:41 +02:00 |
| vahidrezanezhad | 310679eeb8 | page extraction model name is changed | 2025-09-16 14:27:15 +02:00 |
| vahidrezanezhad | 542646791d | For TrOCR, the cropped text lines will no longer be added to a list before prediction. Instead, for each batch size, the text line images will be collected and predictions will be made directly on them. | 2025-09-23 19:03:13 +02:00 |
| vahidrezanezhad | 68a71be8bc | Running inference on files in a directory | 2025-09-13 22:40:11 +02:00 |
| vahidrezanezhad | 0711166524 | changed the drop capitals bonding box to contour ratio threshold | 2025-09-01 11:37:22 +02:00 |
| vahidrezanezhad | e15640aa8a | new page extraction model integration | 2025-09-15 13:36:58 +02:00 |
| vahidrezanezhad | 6a735daa60 | Update README.md | 2025-08-31 23:30:54 +02:00 |
| vahidrezanezhad | 9b9d21d8ac | eynollah ocr: support using either a specific model name or a models directory (default model) | 2025-08-28 11:30:59 +02:00 |
| vahidrezanezhad | 41365645ef | Marginals are divided into left and right, and written from top to bottom. | 2025-08-26 22:38:03 +02:00 |
| vahidrezanezhad | 7741502876 | reading order on given layout | 2025-08-18 02:31:13 +02:00 |
| Clemens Neudecker | a2359ea4c4 | Merge pull request #171 from bertsky/ocrd-machine-based-ro OCR-D processor: expose reading_order_machine_based | 2025-08-15 18:40:13 +02:00 |
| Robert Sachunsky | 21615a986d | OCR-D processor: expose reading_order_machine_based | 2025-08-13 14:14:37 +02:00 |
| michalbubula | 8ebba5ac04 | add feedback to command line interface | 2025-08-12 16:21:15 +02:00 |
| vahidrezanezhad | 268aa141d7 | avoiding float in range | 2025-08-12 12:50:15 +02:00 |
| vahidrezanezhad | cf4983da54 | visualize vertical ocr text vertically | 2025-08-08 16:12:55 +02:00 |
| vahidrezanezhad | 52d9cc9baf | deskewing with faster multiprocessing | 2025-08-08 11:32:02 +02:00 |
| vahidrezanezhad | 263da755ef | loading xmls with UTF-8 encoding | 2025-08-07 10:32:49 +02:00 |
| vahidrezanezhad | 6462ea5b33 | adding visualization of ocr text of xml file | 2025-08-06 22:33:42 +02:00 |
| vahidrezanezhad | 322b04145f | use the latest ocr model with balanced fraktur-antiqua training dataset | 2025-08-05 14:22:22 +02:00 |
| vahidrezanezhad | 1b95f8f38d | threshold for textline ocr + new ocr model | 2025-07-25 13:18:38 +02:00 |
| Clemens Neudecker | 2996fc8b30 | Merge pull request #166 from qurator-spk/updating_readme_for_eynollah_use_cases-cli Updating readme for eynollah use cases cli | 2025-07-24 15:30:57 +02:00 |
| vahidrezanezhad | fd0595f920 | Update Makefile | 2025-07-24 13:52:38 +02:00 |
| vahidrezanezhad | da141bb42e | resolving tests error | 2025-07-23 16:44:17 +02:00 |
| vahidrezanezhad | 6b8893b188 | Merge pull request #167 from qurator-spk/ocrd-fixes Ocrd fixes | 2025-07-22 14:46:25 +02:00 |
| vahidrezanezhad | daa597dbaa | should merged text for the whole page be written in xml? | 2025-07-21 14:50:05 +02:00 |
| vahidrezanezhad | 673e67a847 | update model names | 2025-07-21 10:54:20 +02:00 |
| vahidrezanezhad | fee40049cd | ocr model renamed - image text font for ocr result is now using Charis-7.000 font (downloaded from here https://software.sil.org/charis/download/) | 2025-07-16 14:00:12 +02:00 |
| vahidrezanezhad | 04fead348f | ocr: make sure that image height or width is not zero | 2025-07-03 15:24:52 +02:00 |
| vahidrezanezhad | 53dd4b26a9 | decorated with confidence value for cnnrnn ocr model | 2025-07-03 11:50:47 +02:00 |
| vahidrezanezhad | 1b222594d6 | Update README.md: how to train model using docker image | 2025-06-25 18:33:55 +02:00 |
| vahidrezanezhad | f5a1d1a255 | docker file to train model with desired cuda and cudnn | 2025-06-25 18:24:16 +02:00 |
| kba | b7b218ff11 | OCR-D processor: same behavior as standalone wrt light_version/textline_light | 2025-06-12 15:30:17 +02:00 |
| vahidrezanezhad | c194a20c9c | Fixed duplicate textline_light assignments (true and false) in the OCR-D framework for the Eynollah light version, which caused rectangles to be used instead of contours for textlines | 2025-06-12 15:27:22 +02:00 |
| kba | 32889ef1e0 | adapt binarization CLI according to #156 | 2025-06-12 13:57:41 +02:00 |
| vahidrezanezhad | 9b4e78c55c | Fixed duplicate textline_light assignments (true and false) in the OCR-D framework for the Eynollah light version, which caused rectangles to be used instead of contours for textlines | 2025-06-11 18:57:08 +02:00 |