diff --git a/docs/Preprocessing.md b/docs/Preprocessing.md
index 8a7dc9c..0f38bf0 100644
--- a/docs/Preprocessing.md
+++ b/docs/Preprocessing.md
@@ -4,7 +4,6 @@
 The preprocessing pipeline that is developed at the [Berlin State Library](http://staatsbibliothek-berlin.de/) comprises the following steps:
 
 - textline extraction @[sbb_pixelwise_segmentation](https://github.com/qurator-spk/pixelwise_segmentation_SBB)
-- word segmentation @[ocrd_tesserocr](https://github.com/OCR-D/ocrd_tesserocr)
-- OCR @[ocrd_calamari](https://github.com/qurator-spk/ocrd_calamari)
+- OCR + word segmentation @[ocrd_tesserocr](https://github.com/OCR-D/ocrd_tesserocr)
 - Tokenization
-- Pretagging @[sbb_ner](https://github.com/qurator-spk/sbb_ner)
\ No newline at end of file
+- Pretagging @[sbb_ner](https://github.com/qurator-spk/sbb_ner)