TSV - Processing Tools

Installation:

Set up a virtual environment:

virtualenv --python=python3.6 venv

Activate the virtual environment:

source venv/bin/activate

Upgrade pip:

pip install -U pip

Install the package together with its dependencies in development mode:

pip install -e ./
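
To confirm that the installation worked, you can check that the command-line tools are on the PATH of the active environment. This is a minimal sketch, assuming the package installs annotate-tsv and extract-doc-links as console scripts; missing_tools is a hypothetical helper and not part of this package:

```python
import shutil

# Command names as they appear in this README (assumed to be
# installed as console scripts by "pip install -e ./").
TOOLS = ("annotate-tsv", "extract-doc-links")


def missing_tools(names=TOOLS):
    """Return the names that cannot be found on the PATH."""
    return [name for name in names if shutil.which(name) is None]


missing = missing_tools()
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("All tools are available.")
```

If a tool is reported as missing, make sure the virtual environment is activated before running the check.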

Usage:

Create a URL-annotated TSV file from an existing TSV file:

annotate-tsv enp_DE.tsv enp_DE-annotated.tsv

Create a corresponding URL-mapping file:

extract-doc-links enp_DE.tsv enp_DE-urls.tsv

By loading both the annotated TSV file and the URL-mapping file into ner.edith, you can jump directly to the original image from which the full text was extracted.
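
When several TSV files need to be processed, the two commands can be scripted. The following is a sketch only: build_commands and process are hypothetical helpers, and the output names simply follow the pattern of the examples above (enp_DE.tsv becomes enp_DE-annotated.tsv and enp_DE-urls.tsv). It assumes both tools are on the PATH and accept exactly the two positional arguments shown above:

```python
import subprocess
from pathlib import Path


def build_commands(tsv_path):
    """Return the two command lines for one input TSV file.

    Output files are placed next to the input and named
    <stem>-annotated.tsv and <stem>-urls.tsv (an assumed convention
    copied from the README examples).
    """
    src = Path(tsv_path)
    annotated = src.with_name(src.stem + "-annotated.tsv")
    urls = src.with_name(src.stem + "-urls.tsv")
    return [
        ["annotate-tsv", str(src), str(annotated)],
        ["extract-doc-links", str(src), str(urls)],
    ]


def process(tsv_path):
    """Run both tools on one file; raises CalledProcessError on failure."""
    for cmd in build_commands(tsv_path):
        subprocess.run(cmd, check=True)
```

Calling process("enp_DE.tsv") would then run the same two commands as in the Usage section.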