mirror of https://github.com/mikegerber/ocrd_calamari.git
synced 2025-10-30 23:34:13 +01:00

	📝 Reduce process() docstring again

parent b4c3b026db
commit 34013ddb02

1 changed file with 3 additions and 23 deletions

@@ -69,29 +69,9 @@ class CalamariRecognize(Processor):
         """
         Perform text recognition with Calamari on the workspace.
 
-        For each page of the input file group, open and deserialize input PAGE-XML
-        and its respective images. Then iterate over the element hierarchy down to
-        the line level.
-
-        For each textline, retrieve a segment image according to the layout annotation
-        (from an existing ``AlternativeImage``, or by cropping into the higher-level
-        images, and deskewing when applicable).
-
-        If the line element contained any previous text results or word segmentation,
-        delete it.
-
-        Convert the line image to a Numpy array and pass it to the recognizer. Aggregate
-        character results on the line level, stripping leading and trailing white space,
-        and selecting the best hypothesis for each position. Annotate the resulting
-        TextEquiv string and (average) confidence on the line segment.
-
-        If ``texequiv_level`` is ``word`` or ``glyph``, then additionally create word
-        level segments by splitting at white space characters, using the vertical
-        line coordinates and horizontal white space boundaries. In the case of ``glyph``,
-        create glyph level segments as well, adding all alternative character hypotheses
-        down to ``glyph_conf_cutoff`` confidence threshold.
-
-        Produce a new PAGE output file by serialising the resulting hierarchy.
+        If ``texequiv_level`` is ``word`` or ``glyph``, then additionally create word / glyph level segments by
+        splitting at white space characters / glyph boundaries. In the case of ``glyph``, add all alternative character
+        hypotheses down to ``glyph_conf_cutoff`` confidence threshold.
         """
         log = getLogger('processor.CalamariRecognize')
 
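
The shortened docstring compresses two steps: splitting a recognized line into words at white space, and keeping glyph alternatives down to ``glyph_conf_cutoff``. The following is a minimal sketch of that idea only, not the code in this repository. The ``Glyph`` class, the helper names, and the choice of ``>=`` for the cutoff comparison are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Glyph:
    """Hypothetical per-position result (not a Calamari or OCR-D type).

    alternatives: (character, confidence) pairs, best hypothesis first.
    x_start, x_end: horizontal extent of the position within the line image.
    """
    alternatives: List[Tuple[str, float]]
    x_start: int
    x_end: int


def split_words(glyphs: List[Glyph]) -> List[List[Glyph]]:
    """Group consecutive non-whitespace positions into words.

    A position counts as white space when its best hypothesis is a
    whitespace character; such positions end the current word.
    """
    words: List[List[Glyph]] = []
    current: List[Glyph] = []
    for glyph in glyphs:
        best_char = glyph.alternatives[0][0]
        if best_char.isspace():
            if current:
                words.append(current)
                current = []
        else:
            current.append(glyph)
    if current:
        words.append(current)
    return words


def word_text(word: List[Glyph]) -> str:
    """Join the best hypothesis of each position in a word."""
    return "".join(glyph.alternatives[0][0] for glyph in word)


def word_extent(word: List[Glyph]) -> Tuple[int, int]:
    """Horizontal boundaries of a word, taken from its first and last glyph."""
    return word[0].x_start, word[-1].x_end


def keep_alternatives(glyph: Glyph, glyph_conf_cutoff: float) -> List[Tuple[str, float]]:
    """Keep hypotheses whose confidence reaches the cutoff (assumed inclusive)."""
    return [(char, conf) for char, conf in glyph.alternatives
            if conf >= glyph_conf_cutoff]
```

In this sketch the vertical word coordinates would be inherited from the line, and only the horizontal extent comes from the glyph positions, which matches the wording of the longer docstring that this commit removes.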