From 01c080387e7b135a69350b249234a15bd8b2d9c6 Mon Sep 17 00:00:00 2001
From: Kai Labusch
Date: Wed, 2 Oct 2019 15:01:15 +0200
Subject: [PATCH] add tools README

---
 tools/README.md | 40 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 40 insertions(+)

diff --git a/tools/README.md b/tools/README.md
index e69de29..8b5840e 100644
--- a/tools/README.md
+++ b/tools/README.md
@@ -0,0 +1,40 @@
+# TSV - Processing Tools
+
+## Installation:
+
+Set up a virtual environment:
+```
+virtualenv --python=python3.6 venv
+```
+
+Activate the virtual environment:
+```
+source venv/bin/activate
+```
+
+Upgrade pip:
+```
+pip install -U pip
+```
+
+Install the package together with its dependencies in development mode:
+```
+pip install -e ./
+```
+
+## Usage:
+
+Create a URL-annotated TSV file from an existing TSV file:
+
+```
+annotate-tsv enp_DE.tsv enp_DE-annotated.tsv
+```
+Create a corresponding URL-mapping file:
+
+```
+extract-doc-links enp_DE.tsv enp_DE-urls.tsv
+```
+
+By loading the annotated TSV as well as the URL-mapping file into
+ner.edith, you will be able to jump directly to the original image
+from which the full text was extracted.
\ No newline at end of file
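
Note on the Usage section of the README added by this patch: the two commands follow the same naming pattern (`enp_DE.tsv` becomes `enp_DE-annotated.tsv` and `enp_DE-urls.tsv`), so they could be wrapped in a small helper. The following is only a minimal sketch, assuming the virtual environment is active so that `annotate-tsv` and `extract-doc-links` are on `PATH`. The function names `output_name` and `process_tsv` are hypothetical and not part of the package.

```shell
#!/usr/bin/env bash
# Sketch only: output_name and process_tsv are hypothetical helpers,
# not commands shipped with the tools package.

# Build an output file name by inserting a suffix before the .tsv
# extension, following the naming pattern used in the README examples.
output_name() {
    local input="$1"
    local suffix="$2"
    printf '%s-%s.tsv\n' "${input%.tsv}" "$suffix"
}

# Run both README commands for one input TSV. Assumes annotate-tsv and
# extract-doc-links are on PATH (i.e. the virtual environment is active).
# The URL-mapping step only runs if the annotation step succeeds.
process_tsv() {
    local input="$1"
    annotate-tsv "$input" "$(output_name "$input" annotated)" &&
        extract-doc-links "$input" "$(output_name "$input" urls)"
}
```

With these functions sourced into the current shell, `process_tsv enp_DE.tsv` would run the two commands from the Usage section in sequence, writing `enp_DE-annotated.tsv` and `enp_DE-urls.tsv`.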