Update Preprocessing.md

2026-02-04 16:41:54 +01:00 · 2019-11-19 23:50:52 +01:00 · 5f6b8bc9c3
commit 5f6b8bc9c3
parent 564a9ee851
1 changed file with 11 additions and 3 deletions
--- a/docs/Preprocessing.md
+++ b/docs/Preprocessing.md
@@ -3,7 +3,15 @@
 The preprocessing pipeline that is developed at the 
 [Berlin State Library](http://staatsbibliothek-berlin.de/) 
 comprises the following steps:
-- textline extraction @[sbb_pixelwise_segmentation](https://github.com/qurator-spk/pixelwise_segmentation_SBB)
+- Layout Analysis & Textline Extraction @[sbb_pixelwise_segmentation](https://github.com/qurator-spk/pixelwise_segmentation_SBB)
-- OCR + word segmentation @[ocrd_tesserocr](https://github.com/OCR-D/ocrd_tesserocr)
+- OCR & Word Segmentation @[ocrd_tesserocr](https://github.com/OCR-D/ocrd_tesserocr)
 - Tokenization
-- Pretagging @[sbb_ner](https://github.com/qurator-spk/sbb_ner)
+- Named Entity Recognition @[sbb_ner](https://github.com/qurator-spk/sbb_ner)
+### Layout Analysis & Textline Extraction
+### OCR & Word Segmentation
+### Tokenization
+### Named Entity Recognition