Update README.md

pull/39/head
Clemens Neudecker 5 years ago committed by GitHub
parent f60b0d4e93
commit f2e9ed535d
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23

@ -37,4 +37,14 @@ extract-doc-links enp_DE.tsv enp_DE-urls.tsv
By loading the annotated TSV as well as the url mapping file into
ner.edith, you will be able to jump directly to the original image
where the full text has been extracted from.
# PAGE-XML to TSV Transformation
## Usage:
Create a TSV file from OCR in PAGE-XML format (with word segmentation):
```
python page2tsv.py PAGE.xml > PAGE.tsv
```
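
For readers unfamiliar with PAGE-XML, here is a minimal sketch of the kind of transformation described above. It is not the actual `page2tsv.py` implementation. The sample document, the helper names `page_words` and `to_tsv`, and the column layout (`TOKEN`, `left`, `top`, `right`, `bottom`) are all assumptions chosen for illustration. The sketch reads each `Word` element, takes its text and the bounding box of its `Coords` polygon, and writes one TSV row per word.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical, hand-written PAGE-XML sample with word segmentation.
SAMPLE_PAGE_XML = """
<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15">
  <Page imageFilename="page.png" imageWidth="100" imageHeight="100">
    <TextRegion id="r1">
      <TextLine id="l1">
        <Word id="w1">
          <Coords points="10,10 40,10 40,20 10,20"/>
          <TextEquiv><Unicode>Hello</Unicode></TextEquiv>
        </Word>
        <Word id="w2">
          <Coords points="45,10 80,10 80,20 45,20"/>
          <TextEquiv><Unicode>world</Unicode></TextEquiv>
        </Word>
      </TextLine>
    </TextRegion>
  </Page>
</PcGts>
"""


def local_name(tag):
    """Strip the XML namespace so the sketch works across PAGE schema versions."""
    return tag.rsplit("}", 1)[-1]


def page_words(xml_text):
    """Return (text, left, top, right, bottom) for every Word element."""
    root = ET.fromstring(xml_text)
    rows = []
    for elem in root.iter():
        if local_name(elem.tag) != "Word":
            continue
        text, points = "", ""
        for child in elem:
            name = local_name(child.tag)
            if name == "Coords":
                points = child.get("points", "")
            elif name == "TextEquiv":
                for sub in child:
                    if local_name(sub.tag) == "Unicode":
                        text = sub.text or ""
        # Reduce the polygon "x1,y1 x2,y2 ..." to its bounding box.
        xs, ys = [], []
        for pair in points.split():
            x, y = pair.split(",")
            xs.append(int(float(x)))
            ys.append(int(float(y)))
        if not xs:
            continue  # skip words without coordinates
        rows.append((text, min(xs), min(ys), max(xs), max(ys)))
    return rows


def to_tsv(rows):
    """Serialize rows as tab-separated text with an assumed header line."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(["TOKEN", "left", "top", "right", "bottom"])
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    print(to_tsv(page_words(SAMPLE_PAGE_XML)), end="")
```

The real tool may emit different or additional columns; consult `page2tsv.py` itself for the actual output format.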
