From 21c470edbf282ce7e8b6dce499b7813f4341802c Mon Sep 17 00:00:00 2001
From: Clemens Neudecker <952378+cneud@users.noreply.github.com>
Date: Tue, 17 Mar 2020 15:55:24 +0100
Subject: [PATCH] Update README.md

---
 README.md | 45 +++++++++++++++++----------------------------
 1 file changed, 17 insertions(+), 28 deletions(-)
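
Note on the TSV layout described in section 2.2 of the README: the minimal Python sketch below shows one way a single row could be read. It is not part of neat or page2tsv. The names `Token` and `parse_row` are hypothetical, and the sketch assumes the pixel coordinates arrive as one comma-separated value in column 7, as in the README's full example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Token:
    """One token row of the extended TSV format (hypothetical container)."""
    position: int          # token position; 0 marks a sentence boundary
    text: str              # TOKEN column
    ne_tag: str            # outer named entity tag (NE-TAG)
    ne_emb: str            # embedded named entity tag (NE-EMB)
    entity_id: str         # ID column (authority file identifier or "-")
    url_id: str            # url_id column, kept as a string
    box: Optional[Tuple[int, int, int, int]]  # left, right, top, bottom


def parse_row(line: str) -> Optional[Token]:
    """Parse one TSV line; return None for header, source and empty lines."""
    line = line.rstrip("\n")
    # Skip blank lines, "#" source lines and the column header row.
    if not line or line.startswith("#") or line.startswith("No."):
        return None
    cols = line.split("\t")
    if len(cols) < 6:
        raise ValueError(f"expected at least 6 columns, got {len(cols)}")
    box = None
    # Assumption: column 7 holds "left,right,top,bottom" as one value.
    if len(cols) > 6 and cols[6]:
        left, right, top, bottom = (int(v) for v in cols[6].split(","))
        box = (left, right, top, bottom)
    return Token(int(cols[0]), cols[1], cols[2], cols[3], cols[4], cols[5], box)


# Usage with the last row of the README's full example.
token = parse_row("2\t3\tO\tO\t-\t0\t1939,1967,364,385")
print(token)
```

Rows without coordinates yield `box=None`, so the same function can read files produced from PAGE-XML without word bounding boxes.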

diff --git a/README.md b/README.md
index 7ee9eaa..42393e4 100644
--- a/README.md
+++ b/README.md
@@ -8,39 +8,32 @@
 
 [2. User Guide](https://github.com/qurator-spk/neat/blob/master/README.md#2-user-guide)
 
-&nbsp;&nbsp;&nbsp;[2.1 Technical requirements](https://github.com/qurator-spk/neat/blob/master/README.md#21-technical-requirements) 
-
-&nbsp;&nbsp;&nbsp;[2.2 Installation](https://github.com/qurator-spk/neat/blob/master/README.md#22-installation) 
-    
-&nbsp;&nbsp;&nbsp;[2.3 Data format](https://github.com/qurator-spk/neat/blob/master/README.md#23-data-format)
-    
-&nbsp;&nbsp;&nbsp;[2.4 Data preparation](https://github.com/qurator-spk/neat/blob/master/README.md#24-data-preparation)
+&nbsp;&nbsp;&nbsp;[2.1 Installation](https://github.com/qurator-spk/neat/blob/master/README.md#21-installation)
     
-&nbsp;&nbsp;&nbsp;[2.5 Keyboard navigation](https://github.com/qurator-spk/neat/blob/master/README.md#26-keyboard-navigation)
+&nbsp;&nbsp;&nbsp;[2.2 Data format](https://github.com/qurator-spk/neat/blob/master/README.md#22-data-format)
     
-&nbsp;&nbsp;&nbsp;[2.6 Mouse navigation](https://github.com/qurator-spk/neat/blob/master/README.md#27-mouse-navigation)
+&nbsp;&nbsp;&nbsp;[2.3 Navigation](https://github.com/qurator-spk/neat/blob/master/README.md#23-navigation)
     
-&nbsp;&nbsp;&nbsp;[2.7 Image support](https://github.com/qurator-spk/neat/blob/master/README.md#28-image-support)
-    
-&nbsp;&nbsp;&nbsp;[2.8 Saving progress](https://github.com/qurator-spk/neat/blob/master/README.md#29-saving-progress)
+&nbsp;&nbsp;&nbsp;[2.4 Saving progress](https://github.com/qurator-spk/neat/blob/master/README.md#24-saving-progress)
 
 [3. Annotation Guidelines](https://github.com/qurator-spk/neat/blob/master/README.md#3-annotation-guidelines)
 
 ### 1. Introduction
-[neat](https://github.com/qurator-spk/neat) is a simple, browser-based tool for editing and annotating text with named entities to produce a dataset for training/testing/evaluation. It can be used to add or correct named entity BIO-tags in a TSV file and to correct the token text or tokenization (e.g. due to OCR/segmentation errors). 
+[neat](https://github.com/qurator-spk/neat) is a simple, browser-based tool for editing and annotating text with named entities to produce labeled data for training/testing/evaluation. It can be used to add or correct named entity labels in a TSV file and to correct the token text or tokenization (e.g. due to OCR/segmentation errors). 
 
 [neat](https://github.com/qurator-spk/neat) is developed at the [Berlin State Library](https://staatsbibliothek-berlin.de/) for data annotation in the [SoNAR-IDH](https://sonar.fh-potsdam.de/) project and the [QURATOR](https://qurator.ai/) project.
 
 ### 2. User Guide
 
-#### 2.1 Technical Requirements 
-[neat](https://github.com/qurator-spk/neat) runs locally as a pure HTML+JavaScript webpage in your web browser. No software needs to be installed, but JavaScript has to be enabled in the browser. 
+#### 2.1 Installation
+[neat](https://github.com/qurator-spk/neat) runs locally as a pure HTML+JavaScript webpage in your web browser. No additional software needs to be installed, but JavaScript has to be enabled in the browser.
+
+Clone the repo using ``git clone https://github.com/qurator-spk/neat.git`` or download and extract the [ZIP](https://github.com/qurator-spk/neat/archive/master.zip). Make sure you have ``neat.html`` and ``neat.js`` in the same directory and open ``neat.html`` in a browser. Any fairly recent browser should work, but only Chrome and Firefox are tested.
 
-#### 2.2. Installation
-Clone the repo using ``git clone https://github.com/qurator-spk/neat.git`` or download the [ZIP](https://github.com/qurator-spk/neat/archive/master.zip). Make sure you have ``neat.html`` and ``neat.js`` in the same directory and open ``neat.html`` in a browser. Any fairly recent browser should work, but only Chrome and Firefox are tested.
+#### 2.2 Data format
+The source data used for annotation in the [SoNAR-IDH](https://sonar.fh-potsdam.de/) project and the [QURATOR](https://qurator.ai/) project are OCR results in [PAGE-XML](https://github.com/PRImA-Research-Lab/PAGE-XML) format. We provide a [Python tool](https://github.com/qurator-spk/page2tsv) for transforming [PAGE-XML](https://github.com/PRImA-Research-Lab/PAGE-XML) OCR files into the [TSV format](https://github.com/qurator-spk/neat/blob/master/README.md#22-data-format) used by [neat](https://github.com/qurator-spk/neat).
 
-#### 2.3 Data format   
-The data format is based on the format used in the [GermEval2014 Named Entity Recognition Shared Task](https://sites.google.com/site/germeval2014ner/data). Text is encoded as one token per line, with name spans encoded in the BIO-scheme, provided as tab-separated values:
+The internal data format used by [neat](https://github.com/qurator-spk/neat) is based on the format used in the [GermEval2014 Named Entity Recognition Shared Task](https://sites.google.com/site/germeval2014ner/data). Text is encoded as one token per line, with name spans in the [IOB2](https://en.wikipedia.org/wiki/Inside%E2%80%93outside%E2%80%93beginning_(tagging)) format as tab-separated values:
 * the first column contains either a `#`, which signals the source the sentence is cited from, or 
 * the token position within the sentence ``>=1``
 * sentence boundaries are indicated by ``0``
@@ -74,7 +67,7 @@ No.	TOKEN	NE-TAG	NE-EMB
 
 For our purposes we extend this format by adding
 * a fifth column for an ``ID`` for the outer ``NE-TAG`` from an authority file
-* column six for use as a variable ``url_id`` (see [Image Support](https://github.com/qurator-spk/neat/blob/master/README.md#27-image-support) for further details)
+* column six for use as a variable ``url_id`` for [IIIF](https://iiif.io/) Image API support ([neat](https://github.com/qurator-spk/neat) supports the embedding of image snippets to assist data annotation and correction if the input PAGE-XML contains word bounding boxes)
 * finally, columns 7+ are used for storing ``left,right,top,bottom`` pixel coordinates for image snippets 
 
 Example (full):
@@ -101,10 +94,9 @@ No.	TOKEN	NE-TAG	NE-EMB	ID	url_id	left,right,top,bottom
 2	3	O	O	-	0	1939,1967,364,385
 ```
 
-#### 2.4 Data preparation  
-The source data that is used for annotation are OCR results in [PAGE-XML](https://github.com/PRImA-Research-Lab/PAGE-XML) format. We provide a [Python tool](https://github.com/qurator-spk/page2tsv) for the transformation of [PAGE-XML](https://github.com/PRImA-Research-Lab/PAGE-XML) OCR files into the [TSV format](https://github.com/qurator-spk/neat/blob/master/README.md#23-data-format) used by [neat](https://github.com/qurator-spk/neat).
+#### 2.3 Navigation
 
-#### 2.5 Keyboard-Navigation
+##### Keyboard
 | Key Combination|      Action      |
 |:---------|:-------------------------------------------|
 | Left     |  Move one cell left                        |
@@ -145,7 +137,7 @@ The source data that is used for annotation are OCR results in [PAGE-XML](https:
 | l r      | remove on display row (minimum is 5)       |
 |----------|--------------------------------------------|
 
-#### 2.6 Mouse-Navigation
+##### Mouse
 * use mouse wheel to scroll up and down
 
 * left-click `<<` and `>>` to move 15 rows up or down
@@ -162,10 +154,7 @@ The source data that is used for annotation are OCR results in [PAGE-XML](https:
 
 * left-click the `POSITION` of a row and select `start-sentence` from the drop-down menu to start a new sentence
 
-#### 2.7 Image Support
-Provided facsimile images are available via the [iiif.io](https://iiif.io/) Image API, [neat](https://github.com/qurator-spk/neat) supports the embedding of image snippets into its interface to assist data annotation and correction. This requires that the PAGE-XML OCR contains word bounding boxes. 
-
-#### 2.8 Saving progress
+#### 2.4 Saving progress
 [neat](https://github.com/qurator-spk/neat) runs fully locally in the browser. Therefore it can not automatically save any changes you made to disk. You have to use the `Save Changes` button to do so manually from time to time.
 
 ### 3. Annotation Guidelines