mirror of
https://github.com/qurator-spk/neat.git
synced 2025-06-11 04:39:54 +02:00
Update README.md
This commit is contained in:
parent
f60b0d4e93
commit
f2e9ed535d
1 changed files with 11 additions and 1 deletions
@@ -37,4 +37,14 @@ extract-doc-links enp_DE.tsv enp_DE-urls.tsv
By loading the annotated TSV together with the URL mapping file into
ner.edith, you can jump directly to the original image
from which the full text was extracted.

# PAGE-XML to TSV Transformation

## Usage:

Create a TSV file from OCR in PAGE-XML format (with word segmentation):

```
python page2tsv.py PAGE.xml > PAGE.tsv
```
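
To illustrate what such a transformation involves, here is a minimal sketch that reads the `Word` elements of a PAGE-XML document and writes one TSV row per word with its bounding box. It is not the code of `page2tsv.py`; the function names and the column layout (`TOKEN`, `left`, `right`, `top`, `bottom`) are assumptions made for this example, and the actual script may emit different or additional columns.

```python
import csv
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the XML namespace, e.g. '{ns}Word' -> 'Word'."""
    return tag.rsplit("}", 1)[-1]


def page_words_to_rows(xml_text):
    """Yield (token, left, right, top, bottom) for each Word element.

    Hypothetical helper: matches elements by local name so that it does
    not depend on a specific PAGE schema version.
    """
    root = ET.fromstring(xml_text)
    for word in root.iter():
        if local_name(word.tag) != "Word":
            continue
        token, points = None, None
        # Only look at direct children, so Glyph-level text is ignored.
        for child in word:
            name = local_name(child.tag)
            if name == "Coords":
                points = child.get("points")
            elif name == "TextEquiv":
                for sub in child:
                    if local_name(sub.tag) == "Unicode":
                        token = sub.text
        if not token or not points:
            continue
        # 'points' is a space-separated list of "x,y" polygon vertices.
        pairs = [tuple(int(v) for v in p.split(",")) for p in points.split()]
        xs = [x for x, _ in pairs]
        ys = [y for _, y in pairs]
        yield token, min(xs), max(xs), min(ys), max(ys)


def rows_to_tsv(rows):
    """Serialize rows as tab-separated text with an (assumed) header line."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["TOKEN", "left", "right", "top", "bottom"])
    writer.writerows(rows)
    return out.getvalue()
```

Calling `rows_to_tsv(page_words_to_rows(xml_text))` on the contents of a PAGE-XML file returns the TSV as a string, which can then be written to a file or printed to standard output.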