User Guide

version 0.1

1. Introduction

2. User Guide

Technical Requirements

ner.edith runs locally as a pure HTML+JavaScript webpage in your web browser. No software needs to be installed, but JavaScript has to be enabled in the browser. Any fairly recent browser should work, but only Chrome and Firefox are tested.

Data input format

Input data must follow the format used in the GermEval2014 Named Entity Recognition Shared Task. Text is encoded as one token per line, with information provided in tab-separated columns. The first column contains either a #, which marks a line giving the source the sentence is cited from and the date it was retrieved, or the token number within the sentence. The second column contains the token. Name spans are encoded in the BIO scheme: outer spans in the third column, embedded spans in the fourth column.
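As an illustration of the column layout described above, here is a minimal Python sketch that reads such a file into sentences. It is not one of the tools shipped with ner.edith. The names `Token`, `Sentence` and `parse_tsv` are hypothetical, the sample data is invented, and the sketch assumes that a `#` line or a blank line starts a new sentence.

```python
from dataclasses import dataclass, field


@dataclass
class Token:
    number: int      # token number within the sentence (first column)
    text: str        # the token itself (second column)
    outer_tag: str   # BIO tag of the outer span (third column)
    inner_tag: str   # BIO tag of the embedded span (fourth column)


@dataclass
class Sentence:
    source: str = ""                       # content of the "#" line, if any
    tokens: list = field(default_factory=list)


def parse_tsv(text):
    """Parse GermEval2014-style TSV text into a list of Sentence objects."""
    sentences = []
    current = None
    for line in text.splitlines():
        if not line.strip():
            current = None                 # assumption: a blank line ends a sentence
            continue
        columns = line.split("\t")
        if columns[0] == "#":
            # Source/date line: start a new sentence and keep the rest as metadata.
            current = Sentence(source="\t".join(columns[1:]).strip())
            sentences.append(current)
            continue
        if len(columns) < 2:
            raise ValueError(f"expected at least two columns, got: {line!r}")
        if current is None:
            current = Sentence()
            sentences.append(current)
        # Missing tag columns are treated as "O" (outside any span).
        outer = columns[2] if len(columns) > 2 else "O"
        inner = columns[3] if len(columns) > 3 else "O"
        current.tokens.append(Token(int(columns[0]), columns[1], outer, inner))
    return sentences


# Invented sample data in the expected layout.
sample = (
    "#\thttp://example.org/article [2014-01-01]\n"
    "1\tAngela\tB-PER\tO\n"
    "2\tMerkel\tI-PER\tO\n"
    "3\tbesucht\tO\tO\n"
    "4\tBerlin\tB-LOC\tO\n"
    "5\t.\tO\tO\n"
)
sentences = parse_tsv(sample)
for token in sentences[0].tokens:
    print(token.number, token.text, token.outer_tag, token.inner_tag)
```

Checking a file this way before loading it into the editor can help catch lines with missing columns or non-numeric token numbers early.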

Data preparation

We also provide some Python tools that help with data wrangling.

Overview of Editor Features

  • Navigation
    • use mouse wheel to scroll up and down
    • use navigation << and >> to move faster
    • show image snippet
  • Tagging
    • adding a tag
    • removing a tag
    • changing a tag
  • OCR correction
    • editing the token text
  • Segmentation correction
    • merging two tokens
    • splitting a token

Data export/Saving progress

The editor runs fully locally in the browser. Therefore it cannot automatically save any changes you made to disk. You have to use the Save Changes button to do so manually from time to time.

If your browser automatically saves all downloads to your Downloads folder, you might want to configure it so that it instead prompts you where to save.

Configuration option in Firefox:

Screenshot

Configuration option in Chrome:

Screenshot

3. FAQ