mirror of
https://github.com/qurator-spk/neat.git
synced 2025-06-12 21:29:54 +02:00
downloaded old neath
This commit is contained in:
parent
1fe2479b6f
commit
62f2f4963b
5 changed files with 1145 additions and 264 deletions
BIN
Annotation_Guidelines.pdf
Normal file
Binary file not shown.
195
README.md
|
@ -1,5 +1,192 @@
|
|||
# neath: named entity annotation tool in html
|
||||
[User Guide](docs/User_Guide.md) | [Annotation Guidelines](docs/Annotation_Guidelines.md)
|
||||
|
||||
# neath: named entity annotation tool
|
||||
#### version 0.1
|
||||
---
|
||||

|
||||

|
||||
---
|
||||
|
||||
### Table of contents
|
||||
[1. Introduction](https://github.com/qurator-spk/neath/blob/master/README.md#1-introduction)
|
||||
|
||||
[2. User Guide](https://github.com/qurator-spk/neath/blob/master/README.md#2-user-guide)
|
||||
|
||||
[2.1 Technical requirements](https://github.com/qurator-spk/neath/blob/master/README.md#21-technical-requirements)
|
||||
|
||||
[2.2 Installation](https://github.com/qurator-spk/neath/blob/master/README.md#22-installation)
|
||||
|
||||
[2.3 Data format](https://github.com/qurator-spk/neath/blob/master/README.md#23-data-format)
|
||||
|
||||
[2.4 Data preparation](https://github.com/qurator-spk/neath/blob/master/README.md#24-data-preparation)
|
||||
|
||||
[2.5 Provenance](https://github.com/qurator-spk/neath/blob/master/README.md#25-provenance)
|
||||
|
||||
[2.6 Keyboard navigation](https://github.com/qurator-spk/neath/blob/master/README.md#26-keyboard-navigation)
|
||||
|
||||
[2.7 Mouse navigation](https://github.com/qurator-spk/neath/blob/master/README.md#27-mouse-navigation)
|
||||
|
||||
[2.8 Image support](https://github.com/qurator-spk/neath/blob/master/README.md#28-image-support)
|
||||
|
||||
[2.9 Saving progress](https://github.com/qurator-spk/neath/blob/master/README.md#29-saving-progress)
|
||||
|
||||
[3. Annotation Guidelines](https://github.com/qurator-spk/neath/blob/master/README.md#3-annotation-guidelines)
|
||||
|
||||
### 1. Introduction
|
||||
[neath](https://github.com/qurator-spk/neath) is a simple, browser-based tool for editing and annotating text with named entities to produce a corpus for training/testing/evaluation. It can be used to add or correct named entity BIO-tags in a TSV file and to correct the token text or tokenization (e.g. due to OCR/segmentation errors).
|
||||
|
||||
[neath](https://github.com/qurator-spk/neath) is developed at the [Berlin State Library](https://staatsbibliothek-berlin.de/) for data annotation in the context of the [SoNAR-IDH](https://sonar.fh-potsdam.de/) project and the [QURATOR](https://qurator.ai/) project.
|
||||
|
||||
### 2. User Guide
|
||||
|
||||
#### 2.1 Technical Requirements
|
||||
[neath](https://github.com/qurator-spk/neath) runs locally as a pure HTML+JavaScript webpage in your web browser. No software needs to be installed, but JavaScript has to be enabled in the browser.
|
||||
|
||||
#### 2.2 Installation
|
||||
Clone the repository with ``git clone https://github.com/qurator-spk/neath.git`` or download the [ZIP](https://github.com/qurator-spk/neath/archive/master.zip). Make sure that at least ``neath.html`` and ``neath.js`` reside in a local directory; then simply open ``neath.html`` in a browser. Any fairly recent browser should work, but only Chrome and Firefox have been tested.
|
||||
|
||||
#### 2.3 Data format
|
||||
The data format is based on the format used in the [GermEval2014 Named Entity Recognition Shared Task](https://sites.google.com/site/germeval2014ner/data). Text is encoded as one token per line, with name spans encoded in the BIO-scheme, provided as tab-separated values:
|
||||
* the first column contains either a `#`, which signals the source the sentence is cited from, or
|
||||
* the token position within the sentence ``>=1``
|
||||
* sentence boundaries are indicated by ``0``
|
||||
* the second column contains the token ``text``
|
||||
* outer entity spans are encoded in the third column ``NE-TAG``
|
||||
* embedded entity spans are encoded in the fourth column ``NE-EMB``
|
||||
|
||||
Example (simple):
|
||||
```tsv
|
||||
No. TOKEN NE-TAG NE-EMB
|
||||
# https://example.url
|
||||
1 Donnerstag O O
|
||||
2 , O O
|
||||
3 1 O O
|
||||
4 . O O
|
||||
5 Januar O O
|
||||
6 . O O
|
||||
0 O O
|
||||
1 Berliner B-ORG B-LOC
|
||||
2 Tageblatt I-ORG O
|
||||
3 . O O
|
||||
0 O O
|
||||
1 Nr O O
|
||||
2 . O O
|
||||
3 1 O O
|
||||
4 . O O
|
||||
0 O O
|
||||
1 Seite O O
|
||||
2 3 O O
|
||||
```
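The column layout described above can be read with a few lines of Python. The following is only a minimal sketch, not part of neath or page2tsv: the function name `read_sentences` and the inline `SAMPLE` are hypothetical, and it assumes tab-separated columns, a single header row, source lines starting with `#`, and `0` in the first column as the sentence boundary.

```python
import csv
import io

# Hypothetical sample in the simple four-column layout (tab-separated).
SAMPLE = (
    "No.\tTOKEN\tNE-TAG\tNE-EMB\n"
    "# https://example.url\n"
    "1\tBerliner\tB-ORG\tB-LOC\n"
    "2\tTageblatt\tI-ORG\tO\n"
    "3\t.\tO\tO\n"
    "0\t\tO\tO\n"
    "1\tSeite\tO\tO\n"
    "2\t3\tO\tO\n"
)


def read_sentences(text):
    """Group token rows into sentences; a first column of 0 marks a boundary."""
    sentences, current = [], []
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    next(reader)  # skip the header row
    for row in reader:
        if not row or row[0].startswith("#"):
            continue  # blank lines and source lines carry no tokens
        if row[0] == "0":
            if current:
                sentences.append(current)
            current = []
            continue
        position, token, ne_tag, ne_emb = row[:4]
        current.append((int(position), token, ne_tag, ne_emb))
    if current:
        sentences.append(current)
    return sentences


sentences = read_sentences(SAMPLE)
```

Each sentence is returned as a list of `(position, token, NE-TAG, NE-EMB)` tuples, which is usually enough for counting tags or checking BIO consistency before loading a file into the tool.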
|
||||
|
||||
For our purposes we extend this format by adding
|
||||
* a fifth column for an ``ID`` for the outer ``NE-TAG`` from an authority file (in this case, the [GND](https://www.dnb.de/EN/Professionell/Standardisierung/GND/gnd_node.html) is used)
|
||||
* column six for use as a variable ``url_id`` (see [Image Support](https://github.com/qurator-spk/neath/blob/master/README.md#28-image-support) for further details)
|
||||
* finally, columns 7+ are used for storing ``left,right,top,bottom`` pixel coordinates for facsimile snippets
|
||||
|
||||
Example (full):
|
||||
```tsv
|
||||
No. TOKEN NE-TAG NE-EMB GND-ID url_id left,right,top,bottom
|
||||
# https://example.url/iiif/left,right,top,bottom/full/0/default.jpg
|
||||
1 Donnerstag O O - 0 174,352,358,390
|
||||
2 , O O - 0 174,352,358,390
|
||||
3 1 O O - 0 367,392,361,381
|
||||
4 . O O - 0 370,397,352,379
|
||||
5 Januar O O - 0 406,518,358,386
|
||||
6 . O O - 0 406,518,358,386
|
||||
0
|
||||
1 Berliner B-ORG B-LOC 1086206452 0 816,984,358,388
|
||||
2 Tageblatt I-ORG O 1086206452 0 1005,1208,360,387
|
||||
3 . O O - 0 1005,1208,360,387
|
||||
0
|
||||
1 Nr O O - 0 1237,1288,360,382
|
||||
2 . O O - 0 1237,1288,360,382
|
||||
3 1 O O - 0 1304,1326,361,381
|
||||
4 . O O - 0 1304,1326,361,381
|
||||
0
|
||||
1 Seite O O - 0 1837,1926,361,392
|
||||
2 3 O O - 0 1939,1967,364,385
|
||||
```
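A row in the extended layout could be mapped to a small record type as sketched below. This is an illustration under assumptions, not the tool's own parser: `TokenRow` and `parse_row` are hypothetical names, columns are assumed to be tab-separated, and the coordinates are accepted either as one comma-separated field or as four separate columns.

```python
import re
from dataclasses import dataclass


@dataclass
class TokenRow:
    position: int
    token: str
    ne_tag: str
    ne_emb: str
    gnd_id: str   # "-" when no authority ID is assigned
    url_id: int
    left: int
    right: int
    top: int
    bottom: int


def parse_row(line):
    """Parse one tab-separated token row of the extended format."""
    fields = line.rstrip("\n").split("\t")
    position, token, ne_tag, ne_emb, gnd_id, url_id = fields[:6]
    # Columns 7+ hold the coordinates, comma-separated or one per column.
    rest = "\t".join(fields[6:]).strip()
    coords = [int(v) for v in re.split(r"[,\t ]+", rest) if v]
    if len(coords) != 4:
        raise ValueError(f"expected 4 coordinates, got {len(coords)}")
    return TokenRow(int(position), token, ne_tag, ne_emb, gnd_id, int(url_id), *coords)


row = parse_row("1\tBerliner\tB-ORG\tB-LOC\t1086206452\t0\t816,984,358,388")
```

Keeping the GND-ID as a string avoids losing identifiers that contain a hyphen or a trailing check character.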
|
||||
|
||||
#### 2.4 Data preparation
|
||||
The source data that is used for annotation are OCR results in [PAGE-XML](https://github.com/PRImA-Research-Lab/PAGE-XML) format. We provide a [Python tool](https://github.com/qurator-spk/page2tsv) that supports the transformation of [PAGE-XML](https://github.com/PRImA-Research-Lab/PAGE-XML) OCR files into the [TSV format](https://github.com/qurator-spk/neath/blob/master/README.md#23-data-format) required for use with [neath](https://github.com/qurator-spk/neath).
|
||||
|
||||
#### 2.5 Provenance
|
||||
The processing pipeline applied at the Berlin State Library comprises the following steps:
|
||||
|
||||
1. Layout Analysis & Textline Extraction
|
||||
Layout analysis and textline extraction are performed with [sbb_textline_detector](https://github.com/qurator-spk/sbb_textline_detector).
|
||||
2. OCR & Word Segmentation
|
||||
OCR is based on [OCR-D](https://github.com/OCR-D)'s [ocrd_tesserocr](https://github.com/OCR-D/ocrd_tesserocr) which requires [Tesseract](https://github.com/tesseract-ocr/tesseract) **>= 4.1.0**. The [GT4HistOCR_2000000](https://ub-backup.bib.uni-mannheim.de/~stweil/ocrd-train/data/GT4HistOCR_2000000.traineddata) model, which is [trained](https://github.com/tesseract-ocr/tesstrain/wiki/GT4HistOCR) on the [GT4HistOCR](https://zenodo.org/record/1344132) corpus, is used. Further details are available in the [paper](https://arxiv.org/abs/1809.05501).
|
||||
3. TSV Transformation
|
||||
A simple [Python tool](https://github.com/qurator-spk/page2tsv) is used for the transformation of the OCR results in [PAGE-XML](https://github.com/PRImA-Research-Lab/PAGE-XML) to [TSV](https://github.com/qurator-spk/neath/blob/master/docs/README.md#23-data-format).
|
||||
4. Tokenization
|
||||
For tokenization, [SoMaJo](https://github.com/tsproisl/SoMaJo) is used.
|
||||
5. Named Entity Recognition
|
||||
For Named Entity Recognition, a [BERT-Base](https://github.com/google-research/bert) model was trained for noisy OCR texts with historical spelling variation. [sbb_ner](https://github.com/qurator-spk/sbb_ner) combines unsupervised training on a large (~2.3m pages) [corpus of German OCR](https://zenodo.org/record/3257041) with supervised training on a small (47k tokens) [annotated corpus](https://github.com/EuropeanaNewspapers/ner-corpora/tree/master/enp_DE.sbb.bio). Further details are available in the [paper](https://corpora.linguistik.uni-erlangen.de/data/konvens/proceedings/papers/KONVENS2019_paper_4.pdf).
|
||||
|
||||
#### 2.6 Keyboard navigation
|
||||
| Key Combination| Action |
|
||||
|:---------|:-------------------------------------------|
|
||||
| Left | Move one cell left |
|
||||
| Right | Move one cell right |
|
||||
| Up | Move one row up |
|
||||
| Down | Move one row down |
|
||||
| PageDown | Move page down |
|
||||
| PageUp | Move page up |
|
||||
| Ctrl+Up | Move entire table one row up |
|
||||
| Ctrl+Down| Move entire table one row down |
|
||||
|----------|--------------------------------------------|
|
||||
| s t | Start new sentence in current row |
|
||||
| m e | Merge current row with row above |
|
||||
| s p | Create copy of current row |
|
||||
| d l | Delete current row |
|
||||
|----------|--------------------------------------------|
|
||||
| backspace| Set NE-TAG / NE-EMB to "O" |
|
||||
| b p | Set NE-TAG / NE-EMB to "B-PER" |
|
||||
| b l | Set NE-TAG / NE-EMB to "B-LOC" |
|
||||
| b o | Set NE-TAG / NE-EMB to "B-ORG" |
|
||||
| b w | Set NE-TAG / NE-EMB to "B-WORK" |
|
||||
| b c | Set NE-TAG / NE-EMB to "B-CONF" |
|
||||
| b e | Set NE-TAG / NE-EMB to "B-EVT" |
|
||||
| b t | Set NE-TAG / NE-EMB to "B-TODO" |
|
||||
| i p | Set NE-TAG / NE-EMB to "I-PER" |
|
||||
| i l | Set NE-TAG / NE-EMB to "I-LOC" |
|
||||
| i o | Set NE-TAG / NE-EMB to "I-ORG" |
|
||||
| i w | Set NE-TAG / NE-EMB to "I-WORK" |
|
||||
| i c | Set NE-TAG / NE-EMB to "I-CONF" |
|
||||
| i e | Set NE-TAG / NE-EMB to "I-EVT" |
|
||||
| i t | Set NE-TAG / NE-EMB to "I-TODO" |
|
||||
|----------|--------------------------------------------|
|
||||
| enter | Edit TOKEN or GND-ID |
|
||||
| esc | Close TOKEN or GND-ID edit field without applying changes |
|
||||
|----------|--------------------------------------------|
|
||||
| l a | add one display row |
|
||||
| l r | remove one display row (minimum is 5) |
|
||||
|----------|--------------------------------------------|
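The tagging shortcuts in the table follow a regular pattern: the first key selects the BIO prefix and the second key selects the entity type. The sketch below only illustrates that pattern; the names `PREFIX_KEYS`, `TYPE_KEYS`, `tag_for_keys` and `SHORTCUTS` are hypothetical and do not reflect how `neath.js` registers its key bindings.

```python
# First key: BIO prefix; second key: entity type (as listed in the table above).
PREFIX_KEYS = {"b": "B", "i": "I"}
TYPE_KEYS = {
    "p": "PER",
    "l": "LOC",
    "o": "ORG",
    "w": "WORK",
    "c": "CONF",
    "e": "EVT",
    "t": "TODO",
}


def tag_for_keys(first, second):
    """Return the tag for a two-key sequence, e.g. ('b', 'p') -> 'B-PER'."""
    return f"{PREFIX_KEYS[first]}-{TYPE_KEYS[second]}"


# All 14 tagging shortcuts, keyed the way the table writes them ("b p", "i l", ...).
SHORTCUTS = {
    f"{p} {t}": tag_for_keys(p, t) for p in PREFIX_KEYS for t in TYPE_KEYS
}
```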
|
||||
|
||||
#### 2.7 Mouse navigation
|
||||
* use mouse wheel to scroll up and down
|
||||
|
||||
* left-click `<<` and `>>` to move 15 rows up or down
|
||||
|
||||
* left-click `O` in the `NE-TAG` or `NE-EMB` columns to open the drop-down menu and select any of the supported NE-Tags to tag a token or change an existing tag to another one
|
||||
|
||||
* left-click a tag in the `NE-TAG` or `NE-EMB` columns and subsequently select `O` to remove a wrong tag
|
||||
|
||||
* left-click a token in the `TOKEN` column to edit/correct the text content
|
||||
|
||||
* left-click the `POSITION` of a row and select `split` from the drop-down menu to create a copy of the current row
|
||||
|
||||
* left-click the `POSITION` of a row and select `merge` from the drop-down menu to merge the current row with the row above
|
||||
|
||||
* left-click the `POSITION` of a row and select `start-sentence` from the drop-down menu to start a new sentence
|
||||
|
||||
#### 2.8 Image Support
|
||||
Provided that facsimile images are available online via the [IIIF](https://iiif.io/) Image API, [neath](https://github.com/qurator-spk/neath) can embed facsimile snippets into its interface to help with data annotation and correction.
|
||||
This further requires that OCR with word segmentation is applied to the image to determine bounding boxes for tokens.
|
||||
|
||||
The iiif-image-url contained in the source ``#`` can then be used as a replacement for ``url_id`` in combination with the token bounding boxes as ``left,right,top,bottom`` to obtain the facsimile snippet url and display the image in the leftmost column. Clicking on the facsimile snippet opens up a new tab with a larger context.
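One way to picture the URL construction is the sketch below. It is an assumption-laden illustration rather than the logic in `neath.js`: the function name `snippet_url` is hypothetical, the placeholder text `left,top,width,height` is assumed to appear literally in the source URL (as it does in the repository's `example.tsv`), and the region is written in the `x,y,w,h` form that the IIIF Image API uses.

```python
def snippet_url(template, left, right, top, bottom,
                placeholder="left,top,width,height"):
    """Fill the region placeholder of an IIIF image URL with a token's bounding box."""
    if placeholder not in template:
        raise ValueError(f"placeholder {placeholder!r} not found in template")
    # IIIF regions are x,y,w,h, so width and height are derived from the box edges.
    region = f"{left},{top},{right - left},{bottom - top}"
    return template.replace(placeholder, region)


url = snippet_url(
    "https://example.url/iiif/left,top,width,height/full/0/default.jpg",
    left=202, right=277, top=419, bottom=437,
)
```

If a file uses a different placeholder string in its `#` line, it can be passed through the `placeholder` argument.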
|
||||
|
||||
#### 2.9 Saving progress
|
||||
[neath](https://github.com/qurator-spk/neath) runs fully locally in the browser. Therefore, it cannot automatically save your changes to disk. Use the `Save Changes` button to save them manually from time to time. If your browser automatically saves all downloads to your `Downloads` folder, you might want to configure it so that it prompts you where to save instead.
|
||||
|
||||
### 3. Annotation Guidelines
|
||||
The most recent version of the [Annotation Guidelines](https://github.com/qurator-spk/neath/blob/master/Annotation_Guidelines.pdf) is included in this repository.
|
||||
|
|
351
example.tsv
Normal file
|
@ -0,0 +1,351 @@
|
|||
No. TOKEN NE-TAG NE-EMB GND-ID url_id left right top bottom
|
||||
# https://content.staatsbibliothek-berlin.de/zefys/SNP27646518-18800101-0-3-0-0/left,top,width,height/full/0/default.jpg
|
||||
0 Kampf O O - 0 154 212 400 419
|
||||
0 , O O - 0 154 212 400 419
|
||||
0 deſſen O O - 0 221 264 400 419
|
||||
0 Ende O O - 0 274 313 401 417
|
||||
0 vielleicht O O - 0 324 388 399 418
|
||||
0 noch O O - 0 397 429 400 418
|
||||
0 heute O O - 0 439 478 400 418
|
||||
0 nicht O O - 0 487 523 399 417
|
||||
0 abzuſehen O O - 0 532 605 399 418
|
||||
0 wäre O O - 0 615 656 399 417
|
||||
0 , O O - 0 615 656 399 417
|
||||
0 wenn O O - 0 671 701 402 415
|
||||
0 nicht O O - 0 702 755 399 417
|
||||
0 Herr O O - 0 155 192 419 437
|
||||
0 Gambetta B-PER O 118716263 0 202 277 419 437
|
||||
0 als O O - 0 287 311 420 436
|
||||
0 deus O O - 0 320 357 419 434
|
||||
0 ex O O - 0 366 385 422 434
|
||||
0 machina O O - 0 395 451 419 434
|
||||
0 erſchienen O O - 0 452 543 417 436
|
||||
0 wäre O O - 0 553 594 417 437
|
||||
0 , O O - 0 553 594 417 437
|
||||
0 reſp O O - 0 608 642 418 437
|
||||
0 . O O - 0 608 642 418 437
|
||||
0 durch O O - 0 652 692 418 436
|
||||
0 perſön⸗ O O - 0 698 756 418 437
|
||||
0 liche O O - 0 156 188 437 457
|
||||
0 Intervention O O - 0 197 298 438 457
|
||||
0 bei O O - 0 309 330 438 453
|
||||
0 dem O O - 0 339 370 437 453
|
||||
0 Präſidenten O O - 0 379 468 437 457
|
||||
0 Grévy B-PER O 119064693 0 475 524 436 456
|
||||
0 einen O O - 0 534 572 437 453
|
||||
0 Ausgleich O O - 0 577 650 437 455
|
||||
0 herbeigeführt O O - 0 658 755 436 455
|
||||
0 hätte O O - 0 155 207 457 475
|
||||
0 . O O - 0 155 207 457 475
|
||||
0 O O - 0 216 239 457 474
|
||||
0 Es O O - 0 216 239 457 474
|
||||
0 ſcheint O O - 0 252 300 457 475
|
||||
0 dem O O - 0 309 339 457 472
|
||||
0 Kammerpräſidenten O O - 0 349 498 455 474
|
||||
0 plötzlich O O - 0 508 566 455 475
|
||||
0 ein O O - 0 576 598 455 472
|
||||
0 Argwohn O O - 0 604 676 455 475
|
||||
0 oder O O - 0 686 710 455 471
|
||||
0 eine O O - 0 711 756 455 471
|
||||
0 Befürchtung O O - 0 155 250 475 495
|
||||
0 gekommen O O - 0 259 338 475 495
|
||||
0 zu O O - 0 346 354 479 495
|
||||
0 ſein O O - 0 354 404 475 494
|
||||
0 , O O - 0 354 404 475 494
|
||||
0 als O O - 0 414 438 475 490
|
||||
0 ob O O - 0 449 467 474 490
|
||||
0 hinter O O - 0 476 522 474 493
|
||||
0 dem O O - 0 531 561 474 491
|
||||
0 Bemühen O O - 0 570 648 474 492
|
||||
0 , O O - 0 570 648 474 492
|
||||
0 Waddington B-PER O 117086630 0 660 756 474 493
|
||||
0 unb O O - 0 155 185 494 512
|
||||
0 Léon B-PER O 117619744 0 200 249 494 512
|
||||
0 Say I-PER O - 0 254 288 494 512
|
||||
0 zu O O - 0 308 324 498 512
|
||||
0 halten O O - 0 343 398 494 512
|
||||
0 , O O - 0 343 398 494 512
|
||||
0 dagegen O O - 0 410 477 492 512
|
||||
0 Lepère B-PER O 1012607569 0 492 544 493 512
|
||||
0 zu O O - 0 563 581 497 512
|
||||
0 entfernen O O - 0 600 678 492 511
|
||||
0 , O O - 0 600 678 492 511
|
||||
0 die O O - 0 693 718 492 509
|
||||
0 Ab O O - 0 724 756 492 509
|
||||
0 ſicht O O - 0 156 187 513 531
|
||||
0 ſtecke O O - 0 206 250 513 531
|
||||
0 , O O - 0 206 250 513 531
|
||||
0 das O O - 0 268 296 513 529
|
||||
0 neue O O - 0 316 349 516 529
|
||||
0 Miniſterium O O - 0 367 463 511 529
|
||||
0 von O O - 0 482 509 515 528
|
||||
0 dem O O - 0 529 559 512 528
|
||||
0 bisher O O - 0 566 632 511 530
|
||||
0 dominirenden O O - 0 653 756 511 528
|
||||
0 Einfluß O O - 0 156 216 531 550
|
||||
0 des O O - 0 240 266 532 548
|
||||
0 Palais B-LOC O 4342820-4 0 293 346 530 550
|
||||
0 Bourbon I-LOC O - 0 368 437 530 546
|
||||
0 frei O O - 0 462 488 530 549
|
||||
0 zu O O - 0 511 528 535 550
|
||||
0 machen O O - 0 552 610 530 549
|
||||
0 . O O - 0 552 610 530 549
|
||||
0 O O - 0 644 682 529 546
|
||||
0 Sein O O - 0 644 682 529 546
|
||||
0 Beſuch O O - 0 706 756 530 548
|
||||
0 bei O O - 0 159 189 550 567
|
||||
0 Grévy B-PER O 119064693 0 195 246 551 569
|
||||
0 am O O - 0 262 285 554 566
|
||||
0 Sonntag O O - 0 300 368 550 569
|
||||
0 Morgen O O - 0 380 442 549 569
|
||||
0 um O O - 0 457 482 553 565
|
||||
0 10 O O - 0 496 514 550 565
|
||||
0 Uhr O O - 0 525 546 549 568
|
||||
0 ſoll O O - 0 546 593 549 569
|
||||
0 keineswegs O O - 0 607 691 548 567
|
||||
0 erbeten O O - 0 703 756 549 565
|
||||
0 ſondern O O - 0 163 216 570 586
|
||||
0 — O O - 0 225 243 577 580
|
||||
0 zum O O - 0 254 285 573 587
|
||||
0 erſten O O - 0 295 335 569 587
|
||||
0 Mal O O - 0 345 386 567 587
|
||||
0 ! O O - 0 345 386 567 587
|
||||
0 — O O - 0 396 414 576 578
|
||||
0 freiwillig O O - 0 418 493 567 587
|
||||
0 und O O - 0 508 537 568 584
|
||||
0 ziemlich O O - 0 542 605 567 587
|
||||
0 unerwartet O O - 0 615 697 568 583
|
||||
0 erfolgt O O - 0 707 756 567 586
|
||||
0 ſein O O - 0 156 190 586 606
|
||||
0 . O O - 0 156 190 586 606
|
||||
0 O O - 0 209 237 588 604
|
||||
0 Was O O - 0 209 237 588 604
|
||||
0 zwiſchen O O - 0 238 317 587 606
|
||||
0 den O O - 0 327 353 587 603
|
||||
0 beiden O O - 0 362 408 587 603
|
||||
0 Präſidenten O O - 0 418 508 586 606
|
||||
0 verhandelt O O - 0 523 602 587 606
|
||||
0 worden O O - 0 611 671 586 604
|
||||
0 , O O - 0 611 671 586 604
|
||||
0 weiß O O - 0 687 723 586 604
|
||||
0 na⸗ O O - 0 732 756 590 602
|
||||
0 türlich O O - 0 157 205 606 624
|
||||
0 Niemand O O - 0 217 289 607 624
|
||||
0 , O O - 0 217 289 607 624
|
||||
0 wenn O O - 0 300 339 609 623
|
||||
0 nicht O O - 0 349 383 605 624
|
||||
0 Herr O O - 0 393 429 606 624
|
||||
0 Gambetta B-PER O 118716263 0 434 509 606 622
|
||||
0 ſelbſt O O - 0 519 557 604 624
|
||||
0 es O O - 0 566 582 607 621
|
||||
0 hinterher O O - 0 588 656 605 623
|
||||
0 beim O O - 0 666 700 605 621
|
||||
0 Früh O O - 0 710 756 604 624
|
||||
0 — O O - 0 710 756 604 624
|
||||
0 ftück O O - 0 157 189 625 643
|
||||
0 ſeinem O O - 0 199 248 624 643
|
||||
0 Intimus O O - 0 257 330 625 643
|
||||
0 , O O - 0 257 330 625 643
|
||||
0 dem O O - 0 339 370 624 640
|
||||
0 Schauſpieler O O - 0 380 476 624 643
|
||||
0 Coquelin B-PER O 116670673 0 491 559 624 642
|
||||
0 dem O O - 0 575 605 624 640
|
||||
0 „ O O - 0 620 714 623 642
|
||||
0 Jüngeren O O - 0 620 714 623 642
|
||||
0 “ O O - 0 620 714 623 642
|
||||
0 von O O - 0 728 756 626 639
|
||||
0 der O O - 0 157 181 643 660
|
||||
0 Comédie B-ORG O 16295404-9 0 197 262 643 661
|
||||
0 françaiſe I-ORG O - 0 277 345 642 661
|
||||
0 anvertraut O O - 0 359 440 644 659
|
||||
0 hat O O - 0 455 484 644 661
|
||||
0 . O O - 0 455 484 644 661
|
||||
0 O O - 0 503 560 642 659
|
||||
0 Abends O O - 0 503 560 642 659
|
||||
0 im O O - 0 576 595 642 658
|
||||
0 Theater O O - 0 604 665 642 661
|
||||
0 ſpielte O O - 0 665 724 642 661
|
||||
0 der O O - 0 733 756 642 658
|
||||
0 Allgewaltige O O - 0 157 252 662 682
|
||||
0 freilich O O - 0 262 312 662 681
|
||||
0 wieder O O - 0 326 375 661 678
|
||||
0 den O O - 0 389 415 662 678
|
||||
0 Unbefangenen O O - 0 425 530 662 681
|
||||
0 und O O - 0 544 572 661 677
|
||||
0 Ununterrichteten O O - 0 582 711 661 679
|
||||
0 , O O - 0 582 711 661 679
|
||||
0 denn O O - 0 720 755 661 677
|
||||
0 er O O - 0 158 172 686 697
|
||||
0 leugnete O O - 0 182 242 682 700
|
||||
0 ſogar O O - 0 256 296 681 699
|
||||
0 ſeinen O O - 0 312 356 680 699
|
||||
0 Beſuch O O - 0 366 416 681 699
|
||||
0 vom O O - 0 433 465 683 696
|
||||
0 Vormittag O O - 0 481 566 679 699
|
||||
0 , O O - 0 481 566 679 699
|
||||
0 obwohl O O - 0 583 638 681 698
|
||||
0 Hunderte O O - 0 646 716 679 699
|
||||
0 das O O - 0 728 755 679 695
|
||||
0 wohlbekannte O O - 0 157 258 700 718
|
||||
0 kleine O O - 0 271 312 698 715
|
||||
0 Coupé O O - 0 322 371 699 718
|
||||
0 Gambettas B-PER O 119064693 0 382 466 698 716
|
||||
0 eine O O - 0 482 510 699 715
|
||||
0 ganze O O - 0 525 566 702 718
|
||||
0 Stunde O O - 0 577 633 698 715
|
||||
0 lang O O - 0 648 681 698 715
|
||||
0 von O O - 0 695 712 701 714
|
||||
0 der O O - 0 714 756 698 714
|
||||
0 Rue B-LOC O - 0 157 189 718 735
|
||||
0 du I-LOC O - 0 204 222 719 735
|
||||
0 Faubourg I-LOC O - 0 232 308 718 736
|
||||
0 St I-LOC O - 0 324 351 718 735
|
||||
0 . I-LOC O - 0 324 351 718 735
|
||||
0 Honoré I-LOC O - 0 360 418 718 736
|
||||
0 aus O O - 0 434 462 720 733
|
||||
0 im O O - 0 476 496 718 734
|
||||
0 Vorhof O O - 0 505 562 717 735
|
||||
0 des O O - 0 577 602 718 733
|
||||
0 Elyſée B-LOC O 4075880-1 0 612 661 717 736
|
||||
0 ſtationiren O O - 0 666 755 703 736
|
||||
0 geſehen O O - 0 158 211 737 756
|
||||
0 hatten O O - 0 222 273 737 754
|
||||
0 . O O - 0 222 273 737 754
|
||||
0 O O - 0 292 321 737 753
|
||||
0 Der O O - 0 292 321 737 753
|
||||
0 Erfolg O O - 0 331 382 736 756
|
||||
0 dieſer O O - 0 392 432 736 755
|
||||
0 Viſite O O - 0 437 480 736 754
|
||||
0 war O O - 0 490 520 740 753
|
||||
0 denn O O - 0 530 565 736 752
|
||||
0 auch O O - 0 574 606 736 754
|
||||
0 ſchon O O - 0 616 655 735 755
|
||||
0 in O O - 0 665 679 735 752
|
||||
0 derſelben O O - 0 689 756 736 754
|
||||
0 Zeit O O - 0 157 189 756 775
|
||||
0 zu O O - 0 198 214 760 775
|
||||
0 ſpüren O O - 0 224 277 755 774
|
||||
0 , O O - 0 224 277 755 774
|
||||
0 da O O - 0 287 305 755 772
|
||||
0 derjenige O O - 0 314 392 755 774
|
||||
0 , O O - 0 314 392 755 774
|
||||
0 der O O - 0 396 419 756 771
|
||||
0 ſie O O - 0 429 445 755 774
|
||||
0 gemacht O O - 0 455 519 756 774
|
||||
0 , O O - 0 455 519 756 774
|
||||
0 ſie O O - 0 533 550 754 773
|
||||
0 ableugnen O O - 0 565 641 756 774
|
||||
0 wollte O O - 0 651 702 754 770
|
||||
0 . O O - 0 651 702 754 770
|
||||
0 O O - 0 720 756 754 774
|
||||
0 Herr O O - 0 720 756 754 774
|
||||
0 Lepère B-PER O 1012607569 0 156 212 774 793
|
||||
0 , O O - 0 156 212 774 793
|
||||
0 der O O - 0 227 250 774 790
|
||||
0 bereits O O - 0 264 314 774 790
|
||||
0 ſeine O O - 0 331 374 773 792
|
||||
0 Siebenſachen O O - 0 382 480 773 792
|
||||
0 zuſammengepackt O O - 0 494 623 773 793
|
||||
0 hatte O O - 0 638 679 773 791
|
||||
0 , O O - 0 638 679 773 791
|
||||
0 weil O O - 0 696 727 773 789
|
||||
0 er O O - 0 743 756 777 789
|
||||
0 glaubte O O - 0 157 211 793 811
|
||||
0 ausziehen O O - 0 221 295 793 811
|
||||
0 zu O O - 0 305 322 797 811
|
||||
0 müſſen O O - 0 332 383 793 811
|
||||
0 — O O - 0 393 412 801 803
|
||||
0 Freycinet B-PER O 118703099 0 421 493 793 811
|
||||
0 ſelbſt O O - 0 496 544 792 811
|
||||
0 hatte O O - 0 554 590 792 810
|
||||
0 ihm O O - 0 600 629 793 809
|
||||
0 das O O - 0 639 666 792 808
|
||||
0 zu O O - 0 675 692 796 811
|
||||
0 wieder O O - 0 702 756 791 808
|
||||
0 — O O - 0 702 756 791 808
|
||||
0 holten O O - 0 156 202 810 830
|
||||
0 Malen O O - 0 212 262 811 828
|
||||
0 in O O - 0 272 287 811 828
|
||||
0 dürren O O - 0 297 347 812 827
|
||||
0 Worten O O - 0 357 415 812 827
|
||||
0 geſagt O O - 0 425 475 811 830
|
||||
0 — O O - 0 484 503 819 822
|
||||
0 Herr O O - 0 512 548 811 830
|
||||
0 Lepre B-PER O 1012607569 0 556 607 811 829
|
||||
0 erhielt O O - 0 616 664 811 830
|
||||
0 von O O - 0 674 701 814 826
|
||||
0 Gam B-PER O 118716263 0 711 755 811 827
|
||||
0 — I-PER O - 0 711 755 811 827
|
||||
0 betta I-PER O - 0 156 192 829 846
|
||||
0 die O O - 0 202 224 830 846
|
||||
0 Nachricht O O - 0 234 308 830 848
|
||||
0 , O O - 0 234 308 830 848
|
||||
0 daß O O - 0 318 346 830 848
|
||||
0 er O O - 0 356 370 835 846
|
||||
0 bleiben O O - 0 380 432 830 846
|
||||
0 dürfe O O - 0 445 488 830 848
|
||||
0 . O O - 0 445 488 830 848
|
||||
0 O O - 0 508 592 830 848
|
||||
0 Gleichzeitig O O - 0 508 592 830 848
|
||||
0 wurde O O - 0 602 649 829 845
|
||||
0 Herrn O O - 0 658 703 829 848
|
||||
0 Waddington B-PER O 117086630 0 714 756 829 845
|
||||
0 das O O - 0 230 257 849 865
|
||||
0 Gegentheil O O - 0 272 354 848 867
|
||||
0 bedeutet O O - 0 370 437 849 867
|
||||
0 ; O O - 0 370 437 849 867
|
||||
0 den O O - 0 451 476 849 864
|
||||
0 Botſchafterpoſten O O - 0 486 617 848 867
|
||||
0 in O O - 0 633 648 848 864
|
||||
0 London B-LOC O 4074335-4 0 658 716 848 864
|
||||
0 , O O - 0 658 716 848 864
|
||||
0 der O O - 0 720 756 848 866
|
||||
0 ihm O O - 0 156 185 866 885
|
||||
0 als O O - 0 196 219 868 884
|
||||
0 Entſchädigung O O - 0 230 339 867 886
|
||||
0 angeboten O O - 0 350 426 868 886
|
||||
0 wurde O O - 0 436 486 868 884
|
||||
0 , O O - 0 436 486 868 884
|
||||
0 ſchlug O O - 0 496 539 867 886
|
||||
0 er O O - 0 549 563 872 883
|
||||
0 aus O O - 0 573 605 869 883
|
||||
0 . O O - 0 573 605 869 883
|
||||
0 O O - 0 625 648 866 882
|
||||
0 Von O O - 0 625 648 866 882
|
||||
0 allen O O - 0 649 699 868 882
|
||||
0 dieſen O O - 0 699 756 866 884
|
||||
0 Vorgängen O O - 0 159 244 885 905
|
||||
0 erhielt O O - 0 254 305 886 904
|
||||
0 Léon B-PER O 117619744 0 310 350 885 902
|
||||
0 Say I-PER O - 0 360 394 886 905
|
||||
0 erſt O O - 0 407 432 886 902
|
||||
0 in O O - 0 445 460 886 902
|
||||
0 ſpäter O O - 0 475 519 886 903
|
||||
0 Nachmittagsſtunde O O - 0 528 671 885 905
|
||||
0 Kenntniß O O - 0 682 756 885 903
|
||||
0 . O O - 0 682 756 885 903
|
||||
0 O O - 0 161 198 904 921
|
||||
0 Sein O O - 0 161 198 904 921
|
||||
0 Entſchluß O O - 0 208 281 904 923
|
||||
0 war O O - 0 297 328 908 920
|
||||
0 ſofort O O - 0 343 391 905 923
|
||||
0 gefaßt O O - 0 400 451 903 923
|
||||
0 . O O - 0 400 451 903 923
|
||||
0 O O - 0 471 519 905 923
|
||||
0 Gegen O O - 0 471 519 905 923
|
||||
0 6 O O - 0 535 544 907 920
|
||||
0 Uhr O O - 0 560 589 905 922
|
||||
0 Abends O O - 0 599 656 904 920
|
||||
0 fuhr O O - 0 666 690 904 922
|
||||
0 er O O - 0 692 723 909 920
|
||||
0 ins O O - 0 733 756 904 919
|
||||
0 Elyſée B-LOC O 4075880-1 0 158 207 923 942
|
||||
0 und O O - 0 220 248 924 939
|
||||
0 legte O O - 0 264 299 924 940
|
||||
0 ſein O O - 0 313 340 923 940
|
||||
0 Portefeuille O O - 0 355 445 923 942
|
||||
0 in O O - 0 461 475 923 939
|
||||
0 Grevys B-PER O 119064693 0 490 546 923 942
|
||||
0 Hände O O - 0 557 606 923 942
|
||||
0 zurück O O - 0 621 671 923 942
|
||||
0 . O O - 0 621 671 923 942
|
Can't render this file because it has a wrong number of fields in line 2.
|
44
neath.html
|
@ -3,33 +3,35 @@
|
|||
<head>
|
||||
<meta charset="UTF-8">
|
||||
<title>neath</title>
|
||||
<base href="neath.html" target="_blank">
|
||||
<link rel="stylesheet" href="https://stackpath.bootstrapcdn.com/bootstrap/4.3.1/css/bootstrap.min.css"
|
||||
integrity="sha384-ggOyR0iXCbMQv3Xipma34MD+dH/1fQ784/j6cY/iJTQUOhcWr7x9JvoRxT2MZw1T" crossorigin="anonymous">
|
||||
<script type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/PapaParse/5.0.1/papaparse.js"></script>
|
||||
<script type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/PapaParse/5.1.0/papaparse.min.js"></script>
|
||||
<script type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/keypress/2.1.5/keypress.min.js"></script>
|
||||
<script type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.4.1/jquery.min.js"></script>
|
||||
<script type="text/javascript" src="http://code.jquery.com/ui/1.12.1/jquery-ui.min.js"></script>
|
||||
<script type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/jqueryui/1.12.1/jquery-ui.min.js"></script>
|
||||
<style>
|
||||
body{font-family:Verdana;font-size:16px}
|
||||
table{table-layout:fixed;width:100%;text-align:center}
|
||||
th{background-color:lightgray}
|
||||
.editable:hover{background-color:yellow}
|
||||
tr:hover{background-color:whitesmoke}
|
||||
.editable:focus{background-color:#f0e442}
|
||||
tr:focus-within{background-color:#dddddd}
|
||||
|
||||
.accordion:hover .accordion-item:hover .accordion-item-content,
|
||||
.accordion .accordion-item--default .accordion-item-content{height:9em;}
|
||||
.accordion-item-content, .accordion:hover .accordion-item-content{height:0;overflow:hidden;transition:height.25s;}
|
||||
.accordion{padding:0;margin:0auto;width:100px;}
|
||||
.accordion-item:hover{background-color:yellow;}
|
||||
.accordion .accordion-item--default .accordion-item-content{height:10.5em}
|
||||
.accordion-item-content, .accordion:hover .accordion-item-content{height:0;overflow:hidden;transition:height.25s}
|
||||
.accordion{padding:0;margin:auto;width:100px}
|
||||
.accordion-item:hover{background-color:#f0e442}
|
||||
|
||||
.type_select:hover{background-color:yellow;}
|
||||
.type_select:hover{background-color:#f0e442}
|
||||
|
||||
.ner_per{background-color:skyblue}
|
||||
.ner_loc{background-color:goldenrod}
|
||||
.ner_org{background-color:plum}
|
||||
.ner_pub{background-color:lightgreen}
|
||||
.ner_conf{background-color:olive}
|
||||
.ner_art{background-color:lavender}
|
||||
.ner_todo{background-color:turquoise}
|
||||
.ner_per{background-color:#56b3e9}
|
||||
.ner_loc{background-color:#e69d00}
|
||||
.ner_org{background-color:#df6caa}
|
||||
.ner_work{background-color:#009e74}
|
||||
.ner_conf{background-color:#0072b2}
|
||||
.ner_evt{background-color:#a60a2d}
|
||||
.ner_todo{background-color:#d55e00}
|
||||
|
||||
.fit-image{
|
||||
width: 100%;
|
||||
|
@ -53,8 +55,8 @@
|
|||
<div class="col-9">
|
||||
<div class="row">
|
||||
<div class="col text-center">
|
||||
<h3><a href="https://github.com/qurator-spk/neath" target="_blank">neath</a>: named entity annotation tool in html</h3>
|
||||
<a href="https://github.com/qurator-spk/neath/blob/master/docs/User_Guide.md" target="_blank">User Guide</a> | <a href="https://github.com/qurator-spk/neath/blob/master/docs/Annotation_Guidelines.md" target="_blank">Annotation Guidelines</a> | <a href="https://github.com/qurator-spk/neath/issues" target="_blank">Issues</a><hr>
|
||||
<h3><a href="https://github.com/qurator-spk/neath" target="_blank" tabindex="-1">neath</a>: named entity annotation tool</h3>
|
||||
<a href="https://github.com/qurator-spk/neath/blob/master/README.md#2-user-guide" target="_blank" tabindex="-1">User Guide</a> | <a href="https://github.com/qurator-spk/neath/blob/master/Annotation_Guidelines.pdf" target="_blank" tabindex="-1">Annotation Guidelines</a> | <a href="https://github.com/qurator-spk/neath/issues" target="_blank" tabindex="-1">Issues</a><hr>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
@ -62,13 +64,13 @@
|
|||
</div>
|
||||
</div>
|
||||
<div class="row mt-3">
|
||||
<div class="col-2" id="region-left">
|
||||
<div class="col-3" id="region-left">
|
||||
<a href="" id="preview-link">
|
||||
<img id="preview" alt="facsimile_preview" class="img-responsive fit-image"/>
|
||||
</a>
|
||||
</div>
|
||||
<div class="col-9 text-center" id="tableregion">
|
||||
Please upload a TSV file:
|
||||
<div class="col-8 text-center" id="tableregion">
|
||||
Please upload a TSV<sup>(<a href="https://github.com/qurator-spk/neath/blob/master/User_Guide.md#22-data-format">i</a>)</sup> file:
|
||||
<br><br>
|
||||
<input type="file" id="tsv-file" name="files"/>
|
||||
</div>
|
||||
|
|