mirror of https://github.com/qurator-spk/neat.git, synced 2025-10-26 06:14:15 +01:00

downloaded old neath

parent 1fe2479b6f
commit 62f2f4963b

5 changed files with 1145 additions and 264 deletions

Annotation_Guidelines.pdf (BIN, Normal file)
Binary file not shown.

README.md (195 lines changed)
							|  | @ -1,5 +1,192 @@ | |||
| # neath: named entity annotation tool in html | ||||
| [User Guide](docs/User_Guide.md) | [Annotation Guidelines](docs/Annotation_Guidelines.md) | ||||
| 
 | ||||
| # neath: named entity annotation tool | ||||
| #### version 0.1 | ||||
| --- | ||||
|  | ||||
|  | ||||
| --- | ||||
| 
 | ||||
| ### Table of contents | ||||
| [1. Introduction](https://github.com/qurator-spk/neath/blob/master/README.md#1-introduction)  | ||||
| 
 | ||||
| [2. User Guide](https://github.com/qurator-spk/neath/blob/master/README.md#2-user-guide) | ||||
| 
 | ||||
|    [2.1 Technical requirements](https://github.com/qurator-spk/neath/blob/master/README.md#21-technical-requirements)  | ||||
| 
 | ||||
|    [2.2 Installation](https://github.com/qurator-spk/neath/blob/master/README.md#22-installation)  | ||||
|      | ||||
|    [2.3 Data format](https://github.com/qurator-spk/neath/blob/master/README.md#23-data-format) | ||||
|      | ||||
|    [2.4 Data preparation](https://github.com/qurator-spk/neath/blob/master/README.md#24-data-preparation) | ||||
|      | ||||
|    [2.5 Provenance](https://github.com/qurator-spk/neath/blob/master/README.md#25-provenance) | ||||
|      | ||||
|    [2.6 Keyboard navigation](https://github.com/qurator-spk/neath/blob/master/README.md#26-keyboard-navigation) | ||||
|      | ||||
|    [2.7 Mouse navigation](https://github.com/qurator-spk/neath/blob/master/README.md#27-mouse-navigation) | ||||
|      | ||||
|    [2.8 Image support](https://github.com/qurator-spk/neath/blob/master/README.md#28-image-support) | ||||
|      | ||||
|    [2.9 Saving progress](https://github.com/qurator-spk/neath/blob/master/README.md#29-saving-progress) | ||||
| 
 | ||||
| [3. Annotation Guidelines](https://github.com/qurator-spk/neath/blob/master/README.md#3-annotation-guidelines) | ||||
| 
 | ||||
| ### 1. Introduction | ||||
| [neath](https://github.com/qurator-spk/neath) is a simple, browser-based tool for editing and annotating text with named entities to produce a corpus for training/testing/evaluation. It can be used to add or correct named entity BIO-tags in a TSV file and to correct the token text or tokenization (e.g. due to OCR/segmentation errors).  | ||||
| 
 | ||||
| [neath](https://github.com/qurator-spk/neath) is developed at the [Berlin State Library](https://staatsbibliothek-berlin.de/) for data annotation in the context of the [SoNAR-IDH](https://sonar.fh-potsdam.de/) project and the [QURATOR](https://qurator.ai/) project. | ||||
| 
 | ||||
| ### 2. User Guide | ||||
| 
 | ||||
| #### 2.1 Technical Requirements  | ||||
| [neath](https://github.com/qurator-spk/neath) runs locally as a pure HTML+JavaScript webpage in your web browser. No software needs to be installed, but JavaScript has to be enabled in the browser.  | ||||
| 
 | ||||
| #### 2.2 Installation | ||||
| Clone the repo using ``git clone https://github.com/qurator-spk/neath.git`` or download the [ZIP](https://github.com/qurator-spk/neath/archive/master.zip). Make sure that at least ``neath.html`` and ``neath.js`` reside in a local directory; it is then sufficient to open ``neath.html`` in a browser. Any fairly recent browser should work, but only Chrome and Firefox have been tested. | ||||
| 
 | ||||
| #### 2.3 Data format    | ||||
| The data format is based on the format used in the [GermEval2014 Named Entity Recognition Shared Task](https://sites.google.com/site/germeval2014ner/data). Text is encoded as one token per line, with name spans encoded in the BIO-scheme, provided as tab-separated values: | ||||
| * the first column contains either a `#`, which marks the source the sentence is cited from, or the token position within the sentence (``>=1``) | ||||
| * sentence boundaries are indicated by ``0`` in the first column | ||||
| * the second column contains the token ``text``  | ||||
| * outer entity spans are encoded in the third column ``NE-TAG`` | ||||
| * embedded entity spans are encoded in the fourth column ``NE-EMB``  | ||||
| 
 | ||||
| Example (simple): | ||||
| ```tsv | ||||
| No.	TOKEN	NE-TAG	NE-EMB | ||||
| # https://example.url | ||||
| 1	Donnerstag	O	O | ||||
| 2	,	O	O | ||||
| 3	1	O	O	 | ||||
| 4	.	O	O	 | ||||
| 5	Januar	O	O	 | ||||
| 6	.	O	O		 | ||||
| 0		O	O | ||||
| 1	Berliner	B-ORG	B-LOC	 | ||||
| 2	Tageblatt	I-ORG	O	 | ||||
| 3	.	O	O		 | ||||
| 0		O	O | ||||
| 1	Nr	O	O	 | ||||
| 2	.	O	O		 | ||||
| 3	1	O	O	 | ||||
| 4	.	O	O	 | ||||
| 0		O	O | ||||
| 1	Seite	O	O | ||||
| 2	3	O	O | ||||
| ``` | ||||
| 
 | ||||
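The column layout described above can be read with a few lines of Python. This is only a minimal sketch, not code taken from neath or page2tsv: the function name `read_sentences` and the returned dictionary keys are hypothetical, and it assumes that a sentence boundary is a row whose first column is `0` and whose token is empty.

```python
def read_sentences(tsv_text):
    """Group the token rows of a neath-style TSV string into sentences."""
    sentences, current, source = [], [], None
    for line in tsv_text.splitlines()[1:]:  # the first line is the column header
        if not line.strip():
            continue
        if line.startswith("#"):
            # A '#' line names the source of the rows that follow.
            source = line[1:].strip()
            continue
        # Pad short rows (e.g. a bare "0") so that unpacking always works.
        fields = (line.split("\t") + [""] * 4)[:4]
        position, token, ne_tag, ne_emb = (f.strip() for f in fields)
        if position == "0" and token == "":
            # Assumed boundary marker: position 0 with an empty token.
            if current:
                sentences.append(current)
                current = []
            continue
        current.append(
            {"token": token, "ne_tag": ne_tag, "ne_emb": ne_emb, "source": source}
        )
    if current:
        sentences.append(current)
    return sentences


SAMPLE = "\n".join([
    "No.\tTOKEN\tNE-TAG\tNE-EMB",
    "# https://example.url",
    "1\tBerliner\tB-ORG\tB-LOC",
    "2\tTageblatt\tI-ORG\tO",
    "3\t.\tO\tO",
    "0\t\tO\tO",
    "1\tSeite\tO\tO",
    "2\t3\tO\tO",
])

for sentence in read_sentences(SAMPLE):
    print([row["token"] for row in sentence])
```

Only the first four columns are read here, so the same sketch also accepts rows that carry the additional columns of the extended format.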
| For our purposes we extend this format by adding | ||||
| * a fifth column for an ``ID`` for the outer ``NE-TAG`` from an authority file (in this case, the [GND](https://www.dnb.de/EN/Professionell/Standardisierung/GND/gnd_node.html) is used)  | ||||
| * column six for use as a variable ``url_id`` (see [Image Support](https://github.com/qurator-spk/neath/blob/master/README.md#28-image-support) for further details) | ||||
| * finally, columns 7+ are used for storing ``left,right,top,bottom`` pixel coordinates for facsimile snippets  | ||||
| 
 | ||||
| Example (full): | ||||
| ```tsv | ||||
| No.	TOKEN	NE-TAG	NE-EMB	GND-ID	url_id	left,right,top,bottom | ||||
| # https://example.url/iiif/left,right,top,bottom/full/0/default.jpg | ||||
| 1	Donnerstag	O	O	-	0	174,352,358,390 | ||||
| 2	,	O	O	-	0	174,352,358,390	 | ||||
| 3	1	O	O	-	0	367,392,361,381 | ||||
| 4	.	O	O	-	0	370,397,352,379 | ||||
| 5	Januar	O	O	-	0	406,518,358,386 | ||||
| 6	.	O	O	-	0	406,518,358,386	 | ||||
| 0 | ||||
| 1	Berliner	B-ORG	B-LOC	1086206452	0	816,984,358,388 | ||||
| 2	Tageblatt	I-ORG	O	1086206452	0	1005,1208,360,387 | ||||
| 3	.	O	O	-	0	1005,1208,360,387 | ||||
| 0 | ||||
| 1	Nr	O	O	-	0	1237,1288,360,382 | ||||
| 2	.	O	O	-	0	1237,1288,360,382 | ||||
| 3	1	O	O	-	0	1304,1326,361,381 | ||||
| 4	.	O	O	-	0	1304,1326,361,381 | ||||
| 0 | ||||
| 1	Seite	O	O	-	0	1837,1926,361,392 | ||||
| 2	3	O	O	-	0	1939,1967,364,385 | ||||
| ``` | ||||
| 
 | ||||
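To illustrate how the BIO tags and the ``GND-ID`` column fit together, the following sketch collects the outer entity spans from a sequence of rows. It is an illustration under stated assumptions rather than part of neath: the function name `extract_entities` is hypothetical, the ID of a span is taken from its `B-` token, `-` is treated as "no ID", and an `I-` tag without a matching preceding `B-` tag is treated as the start of a new span.

```python
def extract_entities(rows):
    """Collect entity spans from (token, ne_tag, gnd_id) tuples."""
    spans, current = [], None
    for token, tag, gnd_id in rows:
        prefix, _, entity_type = tag.partition("-")
        if prefix == "I" and current is not None and current["type"] == entity_type:
            # Continuation of the span that is currently open.
            current["tokens"].append(token)
            continue
        if current is not None:
            # Any other tag closes the open span.
            spans.append(current)
            current = None
        if prefix in ("B", "I"):
            current = {
                "type": entity_type,
                "tokens": [token],
                "gnd_id": None if gnd_id == "-" else gnd_id,
            }
    if current is not None:
        spans.append(current)
    return [
        {"type": s["type"], "text": " ".join(s["tokens"]), "gnd_id": s["gnd_id"]}
        for s in spans
    ]


rows = [
    ("Berliner", "B-ORG", "1086206452"),
    ("Tageblatt", "I-ORG", "1086206452"),
    (".", "O", "-"),
]
print(extract_entities(rows))
```

Embedded spans in ``NE-EMB`` could be collected the same way by passing that column as the tag.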
| #### 2.4 Data preparation   | ||||
| The source data that is used for annotation are OCR results in [PAGE-XML](https://github.com/PRImA-Research-Lab/PAGE-XML) format. We provide a [Python tool](https://github.com/qurator-spk/page2tsv) that supports the transformation of [PAGE-XML](https://github.com/PRImA-Research-Lab/PAGE-XML) OCR files into the [TSV format](https://github.com/qurator-spk/neath/blob/master/README.md#23-data-format) required for use with [neath](https://github.com/qurator-spk/neath). | ||||
| 
 | ||||
| #### 2.5 Provenance | ||||
| The processing pipeline applied at the Berlin State Library comprises the following steps:  | ||||
| 
 | ||||
| 1. Layout Analysis & Textline Extraction        | ||||
| Layout analysis and textline extraction are performed with [sbb_textline_detector](https://github.com/qurator-spk/sbb_textline_detector). | ||||
| 2. OCR & Word Segmentation     | ||||
| OCR is based on [OCR-D](https://github.com/OCR-D)'s [ocrd_tesserocr](https://github.com/OCR-D/ocrd_tesserocr) which requires [Tesseract](https://github.com/tesseract-ocr/tesseract) **>= 4.1.0**. The [GT4HistOCR_2000000](https://ub-backup.bib.uni-mannheim.de/~stweil/ocrd-train/data/GT4HistOCR_2000000.traineddata) model, which is [trained](https://github.com/tesseract-ocr/tesstrain/wiki/GT4HistOCR) on the [GT4HistOCR](https://zenodo.org/record/1344132) corpus, is used. Further details are available in the [paper](https://arxiv.org/abs/1809.05501). | ||||
| 3. TSV Transformation    | ||||
| A simple [Python tool](https://github.com/qurator-spk/page2tsv) is used for the transformation of the OCR results in [PAGE-XML](https://github.com/PRImA-Research-Lab/PAGE-XML) to [TSV](https://github.com/qurator-spk/neath/blob/master/docs/README.md#23-data-format). | ||||
| 4. Tokenization     | ||||
| For tokenization, [SoMaJo](https://github.com/tsproisl/SoMaJo) is used. | ||||
| 5. Named Entity Recognition     | ||||
| For Named Entity Recognition, a [BERT-Base](https://github.com/google-research/bert) model was trained for noisy OCR texts with historical spelling variation. [sbb_ner](https://github.com/qurator-spk/sbb_ner) combines unsupervised training on a large (~2.3m pages) [corpus of German OCR](https://zenodo.org/record/3257041) with supervised training on a small (47k tokens) [annotated corpus](https://github.com/EuropeanaNewspapers/ner-corpora/tree/master/enp_DE.sbb.bio). Further details are available in the [paper](https://corpora.linguistik.uni-erlangen.de/data/konvens/proceedings/papers/KONVENS2019_paper_4.pdf). | ||||
| 
 | ||||
| #### 2.6 Keyboard-Navigation | ||||
| | Key Combination|      Action      | | ||||
| |:---------|:-------------------------------------------| | ||||
| | Left     |  Move one cell left                        | | ||||
| | Right    |  Move one cell right                       | | ||||
| | Up       |  Move one row up                           | | ||||
| | Down     |  Move one row down                         | | ||||
| | PageDown |  Move page down                            | | ||||
| | PageUp   |  Move page up                              | | ||||
| | Ctrl+Up  |  Move entire table one row up              | | ||||
| | Ctrl+Down|  Move entire table one row down            | | ||||
| |----------|--------------------------------------------| | ||||
| | s  t     |  Start new sentence in current row         | | ||||
| | m  e     |  Merge current row with row above          | | ||||
| | s  p     |  Create copy of current row                | | ||||
| | d  l     |  Delete current row                        | | ||||
| |----------|--------------------------------------------| | ||||
| | backspace|  Set NE-TAG / NE-EMB to "O"                | | ||||
| | b  p     |  Set NE-TAG / NE-EMB to "B-PER"            | | ||||
| | b  l     |  Set NE-TAG / NE-EMB to "B-LOC"            | | ||||
| | b  o     |  Set NE-TAG / NE-EMB to "B-ORG"            | | ||||
| | b  w     |  Set NE-TAG / NE-EMB to "B-WORK"           | | ||||
| | b  c     |  Set NE-TAG / NE-EMB to "B-CONF"           | | ||||
| | b  e     |  Set NE-TAG / NE-EMB to "B-EVT"            | | ||||
| | b  t     |  Set NE-TAG / NE-EMB to "B-TODO"           | | ||||
| | i  p     |  Set NE-TAG / NE-EMB to "I-PER"            | | ||||
| | i  l     |  Set NE-TAG / NE-EMB to "I-LOC"            | | ||||
| | i  o     |  Set NE-TAG / NE-EMB to "I-ORG"            | | ||||
| | i  w     |  Set NE-TAG / NE-EMB to "I-WORK"           | | ||||
| | i  c     |  Set NE-TAG / NE-EMB to "I-CONF"           |  | ||||
| | i  e     |  Set NE-TAG / NE-EMB to "I-EVT"            | | ||||
| | i  t     |  Set NE-TAG / NE-EMB to "I-TODO"           | | ||||
| |----------|--------------------------------------------| | ||||
| | enter    | Edit TOKEN or GND-ID                       | | ||||
| | esc      | Close TOKEN or GND-ID edit field without   | | ||||
| |          | application of changes.                    | | ||||
| |----------|--------------------------------------------| | ||||
| | l a      | add one display row                        | | ||||
| | l r      | remove one display row (minimum is 5)      | | ||||
| |----------|--------------------------------------------| | ||||
| 
 | ||||
| #### 2.7 Mouse-Navigation | ||||
| * use mouse wheel to scroll up and down | ||||
| 
 | ||||
| * left-click `<<` and `>>` to move 15 rows up or down | ||||
| 
 | ||||
| * left-click `O` in the `NE-TAG` or `NE-EMB` columns to open the drop-down menu and select any of the supported NE-Tags to tag a token or change an existing tag to another one | ||||
| 
 | ||||
| * left-click a tag in the `NE-TAG` or `NE-EMB` columns and subsequently select `O` to remove a wrong tag | ||||
| 
 | ||||
| * left-click a token in the `TOKEN` column to edit/correct the text content | ||||
| 
 | ||||
| * left-click the `POSITION` of a row and select `split` from the drop-down menu to create a copy of the current row | ||||
| 
 | ||||
| * left-click the `POSITION` of a row and select `merge` from the drop-down menu to merge the current row with the row above | ||||
| 
 | ||||
| * left-click the `POSITION` of a row and select `start-sentence` from the drop-down menu to start a new sentence | ||||
| 
 | ||||
| #### 2.8 Image Support | ||||
| Provided that facsimile images are available online via the [IIIF](https://iiif.io/) Image API, [neath](https://github.com/qurator-spk/neath) supports the embedding of facsimile snippets into its interface to help with data annotation and correction.  | ||||
| This also requires that OCR with word segmentation has been applied to the image to determine the bounding boxes of the tokens.  | ||||
| 
 | ||||
| The iiif-image-url contained in the source ``#`` can then be used as a replacement for ``url_id`` in combination with the token bounding boxes as ``left,right,top,bottom`` to obtain the facsimile snippet url and display the image in the leftmost column. Clicking on the facsimile snippet opens up a new tab with a larger context. | ||||
| 
 | ||||
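The snippet URL construction can be sketched as follows. How `neath.js` performs the substitution is not shown here, so this is an assumption: the function name `snippet_url` and its `placeholder` parameter are hypothetical, and the bounding box is converted to the `x,y,w,h` region form defined by the IIIF Image API. The default placeholder matches the source line in `example.tsv`.

```python
def snippet_url(template, left, right, top, bottom,
                placeholder="left,top,width,height"):
    """Fill the region placeholder of an IIIF URL template with a token box."""
    if placeholder not in template:
        raise ValueError("template does not contain the region placeholder")
    # IIIF Image API regions are given as x,y,w,h in pixels.
    region = f"{left},{top},{right - left},{bottom - top}"
    return template.replace(placeholder, region)


template = "https://example.url/iiif/left,top,width,height/full/0/default.jpg"
print(snippet_url(template, left=816, right=984, top=358, bottom=388))
```

For a template that uses a different placeholder string, such as the `left,right,top,bottom` form shown in the README example, the `placeholder` argument can be set accordingly.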
| #### 2.9 Saving progress | ||||
| [neath](https://github.com/qurator-spk/neath) runs fully locally in the browser. Therefore it cannot automatically save any changes you make to disk. You have to use the `Save Changes` button to do so manually from time to time. If your browser automatically saves all downloads to your `Downloads` folder, you might want to configure it so that it instead prompts you where to save. | ||||
| 
 | ||||
| ### 3. Annotation Guidelines | ||||
| The most recent version of the [Annotation Guidelines](https://github.com/qurator-spk/neath/blob/master/Annotation_Guidelines.pdf) is included in this repository.  | ||||
|  |  | |||
							
								
								
									
example.tsv (351 lines changed, Normal file)
							|  | @ -0,0 +1,351 @@ | |||
| No.	TOKEN	NE-TAG	NE-EMB	GND-ID	url_id	left	right	top	bottom | ||||
| # https://content.staatsbibliothek-berlin.de/zefys/SNP27646518-18800101-0-3-0-0/left,top,width,height/full/0/default.jpg | ||||
| 0	Kampf	O	O	-	0	154	212	400	419 | ||||
| 0	,	O	O	-	0	154	212	400	419 | ||||
| 0	deſſen	O	O	-	0	221	264	400	419 | ||||
| 0	Ende	O	O	-	0	274	313	401	417 | ||||
| 0	vielleicht	O	O	-	0	324	388	399	418 | ||||
| 0	noch	O	O	-	0	397	429	400	418 | ||||
| 0	heute	O	O	-	0	439	478	400	418 | ||||
| 0	nicht	O	O	-	0	487	523	399	417 | ||||
| 0	abzuſehen	O	O	-	0	532	605	399	418 | ||||
| 0	wäre	O	O	-	0	615	656	399	417 | ||||
| 0	,	O	O	-	0	615	656	399	417 | ||||
| 0	wenn	O	O	-	0	671	701	402	415 | ||||
| 0	nicht	O	O	-	0	702	755	399	417 | ||||
| 0	Herr	O	O	-	0	155	192	419	437 | ||||
| 0	Gambetta	B-PER	O	118716263	0	202	277	419	437 | ||||
| 0	als	O	O	-	0	287	311	420	436 | ||||
| 0	deus	O	O	-	0	320	357	419	434 | ||||
| 0	ex	O	O	-	0	366	385	422	434 | ||||
| 0	machina	O	O	-	0	395	451	419	434 | ||||
| 0	erſchienen	O	O	-	0	452	543	417	436 | ||||
| 0	wäre	O	O	-	0	553	594	417	437 | ||||
| 0	,	O	O	-	0	553	594	417	437 | ||||
| 0	reſp	O	O	-	0	608	642	418	437 | ||||
| 0	.	O	O	-	0	608	642	418	437 | ||||
| 0	durch	O	O	-	0	652	692	418	436 | ||||
| 0	perſön⸗	O	O	-	0	698	756	418	437 | ||||
| 0	liche	O	O	-	0	156	188	437	457 | ||||
| 0	Intervention	O	O	-	0	197	298	438	457 | ||||
| 0	bei	O	O	-	0	309	330	438	453 | ||||
| 0	dem	O	O	-	0	339	370	437	453 | ||||
| 0	Präſidenten	O	O	-	0	379	468	437	457 | ||||
| 0	Grévy	B-PER	O	119064693	0	475	524	436	456 | ||||
| 0	einen	O	O	-	0	534	572	437	453 | ||||
| 0	Ausgleich	O	O	-	0	577	650	437	455 | ||||
| 0	herbeigeführt	O	O	-	0	658	755	436	455 | ||||
| 0	hätte	O	O	-	0	155	207	457	475 | ||||
| 0	.	O	O	-	0	155	207	457	475 | ||||
| 0		O	O	-	0	216	239	457	474 | ||||
| 0	Es	O	O	-	0	216	239	457	474 | ||||
| 0	ſcheint	O	O	-	0	252	300	457	475 | ||||
| 0	dem	O	O	-	0	309	339	457	472 | ||||
| 0	Kammerpräſidenten	O	O	-	0	349	498	455	474 | ||||
| 0	plötzlich	O	O	-	0	508	566	455	475 | ||||
| 0	ein	O	O	-	0	576	598	455	472 | ||||
| 0	Argwohn	O	O	-	0	604	676	455	475 | ||||
| 0	oder	O	O	-	0	686	710	455	471 | ||||
| 0	eine	O	O	-	0	711	756	455	471 | ||||
| 0	Befürchtung	O	O	-	0	155	250	475	495 | ||||
| 0	gekommen	O	O	-	0	259	338	475	495 | ||||
| 0	zu	O	O	-	0	346	354	479	495 | ||||
| 0	ſein	O	O	-	0	354	404	475	494 | ||||
| 0	,	O	O	-	0	354	404	475	494 | ||||
| 0	als	O	O	-	0	414	438	475	490 | ||||
| 0	ob	O	O	-	0	449	467	474	490 | ||||
| 0	hinter	O	O	-	0	476	522	474	493 | ||||
| 0	dem	O	O	-	0	531	561	474	491 | ||||
| 0	Bemühen	O	O	-	0	570	648	474	492 | ||||
| 0	,	O	O	-	0	570	648	474	492 | ||||
| 0	Waddington	B-PER	O	117086630	0	660	756	474	493 | ||||
| 0	unb	O	O	-	0	155	185	494	512 | ||||
| 0	Léon	B-PER	O	117619744	0	200	249	494	512 | ||||
| 0	Say	I-PER	O	-	0	254	288	494	512 | ||||
| 0	zu	O	O	-	0	308	324	498	512 | ||||
| 0	halten	O	O	-	0	343	398	494	512 | ||||
| 0	,	O	O	-	0	343	398	494	512 | ||||
| 0	dagegen	O	O	-	0	410	477	492	512 | ||||
| 0	Lepère	B-PER	O	1012607569	0	492	544	493	512 | ||||
| 0	zu	O	O	-	0	563	581	497	512 | ||||
| 0	entfernen	O	O	-	0	600	678	492	511 | ||||
| 0	,	O	O	-	0	600	678	492	511 | ||||
| 0	die	O	O	-	0	693	718	492	509 | ||||
| 0	Ab	O	O	-	0	724	756	492	509 | ||||
| 0	ſicht	O	O	-	0	156	187	513	531 | ||||
| 0	ſtecke	O	O	-	0	206	250	513	531 | ||||
| 0	,	O	O	-	0	206	250	513	531 | ||||
| 0	das	O	O	-	0	268	296	513	529 | ||||
| 0	neue	O	O	-	0	316	349	516	529 | ||||
| 0	Miniſterium	O	O	-	0	367	463	511	529 | ||||
| 0	von	O	O	-	0	482	509	515	528 | ||||
| 0	dem	O	O	-	0	529	559	512	528 | ||||
| 0	bisher	O	O	-	0	566	632	511	530 | ||||
| 0	dominirenden	O	O	-	0	653	756	511	528 | ||||
| 0	Einfluß	O	O	-	0	156	216	531	550 | ||||
| 0	des	O	O	-	0	240	266	532	548 | ||||
| 0	Palais	B-LOC	O	4342820-4	0	293	346	530	550 | ||||
| 0	Bourbon	I-LOC	O	-	0	368	437	530	546 | ||||
| 0	frei	O	O	-	0	462	488	530	549 | ||||
| 0	zu	O	O	-	0	511	528	535	550 | ||||
| 0	machen	O	O	-	0	552	610	530	549 | ||||
| 0	.	O	O	-	0	552	610	530	549 | ||||
| 0		O	O	-	0	644	682	529	546 | ||||
| 0	Sein	O	O	-	0	644	682	529	546 | ||||
| 0	Beſuch	O	O	-	0	706	756	530	548 | ||||
| 0	bei	O	O	-	0	159	189	550	567 | ||||
| 0	Grévy	B-PER	O	119064693	0	195	246	551	569 | ||||
| 0	am	O	O	-	0	262	285	554	566 | ||||
| 0	Sonntag	O	O	-	0	300	368	550	569 | ||||
| 0	Morgen	O	O	-	0	380	442	549	569 | ||||
| 0	um	O	O	-	0	457	482	553	565 | ||||
| 0	10	O	O	-	0	496	514	550	565 | ||||
| 0	Uhr	O	O	-	0	525	546	549	568 | ||||
| 0	ſoll	O	O	-	0	546	593	549	569 | ||||
| 0	keineswegs	O	O	-	0	607	691	548	567 | ||||
| 0	erbeten	O	O	-	0	703	756	549	565 | ||||
| 0	ſondern	O	O	-	0	163	216	570	586 | ||||
| 0	—	O	O	-	0	225	243	577	580 | ||||
| 0	zum	O	O	-	0	254	285	573	587 | ||||
| 0	erſten	O	O	-	0	295	335	569	587 | ||||
| 0	Mal	O	O	-	0	345	386	567	587 | ||||
| 0	!	O	O	-	0	345	386	567	587 | ||||
| 0	—	O	O	-	0	396	414	576	578 | ||||
| 0	freiwillig	O	O	-	0	418	493	567	587 | ||||
| 0	und	O	O	-	0	508	537	568	584 | ||||
| 0	ziemlich	O	O	-	0	542	605	567	587 | ||||
| 0	unerwartet	O	O	-	0	615	697	568	583 | ||||
| 0	erfolgt	O	O	-	0	707	756	567	586 | ||||
| 0	ſein	O	O	-	0	156	190	586	606 | ||||
| 0	.	O	O	-	0	156	190	586	606 | ||||
| 0		O	O	-	0	209	237	588	604 | ||||
| 0	Was	O	O	-	0	209	237	588	604 | ||||
| 0	zwiſchen	O	O	-	0	238	317	587	606 | ||||
| 0	den	O	O	-	0	327	353	587	603 | ||||
| 0	beiden	O	O	-	0	362	408	587	603 | ||||
| 0	Präſidenten	O	O	-	0	418	508	586	606 | ||||
| 0	verhandelt	O	O	-	0	523	602	587	606 | ||||
| 0	worden	O	O	-	0	611	671	586	604 | ||||
| 0	,	O	O	-	0	611	671	586	604 | ||||
| 0	weiß	O	O	-	0	687	723	586	604 | ||||
| 0	na⸗	O	O	-	0	732	756	590	602 | ||||
| 0	türlich	O	O	-	0	157	205	606	624 | ||||
| 0	Niemand	O	O	-	0	217	289	607	624 | ||||
| 0	,	O	O	-	0	217	289	607	624 | ||||
| 0	wenn	O	O	-	0	300	339	609	623 | ||||
| 0	nicht	O	O	-	0	349	383	605	624 | ||||
| 0	Herr	O	O	-	0	393	429	606	624 | ||||
| 0	Gambetta	B-PER	O	118716263	0	434	509	606	622 | ||||
| 0	ſelbſt	O	O	-	0	519	557	604	624 | ||||
| 0	es	O	O	-	0	566	582	607	621 | ||||
| 0	hinterher	O	O	-	0	588	656	605	623 | ||||
| 0	beim	O	O	-	0	666	700	605	621 | ||||
| 0	Früh	O	O	-	0	710	756	604	624 | ||||
| 0	—	O	O	-	0	710	756	604	624 | ||||
| 0	ftück	O	O	-	0	157	189	625	643 | ||||
| 0	ſeinem	O	O	-	0	199	248	624	643 | ||||
| 0	Intimus	O	O	-	0	257	330	625	643 | ||||
| 0	,	O	O	-	0	257	330	625	643 | ||||
| 0	dem	O	O	-	0	339	370	624	640 | ||||
| 0	Schauſpieler	O	O	-	0	380	476	624	643 | ||||
| 0	Coquelin	B-PER	O	116670673	0	491	559	624	642 | ||||
| 0	dem	O	O	-	0	575	605	624	640 | ||||
| 0	„	O	O	-	0	620	714	623	642 | ||||
| 0	Jüngeren	O	O	-	0	620	714	623	642 | ||||
| 0	“	O	O	-	0	620	714	623	642 | ||||
| 0	von	O	O	-	0	728	756	626	639 | ||||
| 0	der	O	O	-	0	157	181	643	660 | ||||
| 0	Comédie	B-ORG	O	16295404-9	0	197	262	643	661 | ||||
| 0	françaiſe	I-ORG	O	-	0	277	345	642	661 | ||||
| 0	anvertraut	O	O	-	0	359	440	644	659 | ||||
| 0	hat	O	O	-	0	455	484	644	661 | ||||
| 0	.	O	O	-	0	455	484	644	661 | ||||
| 0		O	O	-	0	503	560	642	659 | ||||
| 0	Abends	O	O	-	0	503	560	642	659 | ||||
| 0	im	O	O	-	0	576	595	642	658 | ||||
| 0	Theater	O	O	-	0	604	665	642	661 | ||||
| 0	ſpielte	O	O	-	0	665	724	642	661 | ||||
| 0	der	O	O	-	0	733	756	642	658 | ||||
| 0	Allgewaltige	O	O	-	0	157	252	662	682 | ||||
| 0	freilich	O	O	-	0	262	312	662	681 | ||||
| 0	wieder	O	O	-	0	326	375	661	678 | ||||
| 0	den	O	O	-	0	389	415	662	678 | ||||
| 0	Unbefangenen	O	O	-	0	425	530	662	681 | ||||
| 0	und	O	O	-	0	544	572	661	677 | ||||
| 0	Ununterrichteten	O	O	-	0	582	711	661	679 | ||||
| 0	,	O	O	-	0	582	711	661	679 | ||||
| 0	denn	O	O	-	0	720	755	661	677 | ||||
| 0	er	O	O	-	0	158	172	686	697 | ||||
| 0	leugnete	O	O	-	0	182	242	682	700 | ||||
| 0	ſogar	O	O	-	0	256	296	681	699 | ||||
| 0	ſeinen	O	O	-	0	312	356	680	699 | ||||
| 0	Beſuch	O	O	-	0	366	416	681	699 | ||||
| 0	vom	O	O	-	0	433	465	683	696 | ||||
| 0	Vormittag	O	O	-	0	481	566	679	699 | ||||
| 0	,	O	O	-	0	481	566	679	699 | ||||
| 0	obwohl	O	O	-	0	583	638	681	698 | ||||
| 0	Hunderte	O	O	-	0	646	716	679	699 | ||||
| 0	das	O	O	-	0	728	755	679	695 | ||||
| 0	wohlbekannte	O	O	-	0	157	258	700	718 | ||||
| 0	kleine	O	O	-	0	271	312	698	715 | ||||
| 0	Coupé	O	O	-	0	322	371	699	718 | ||||
| 0	Gambettas	B-PER	O	119064693	0	382	466	698	716 | ||||
| 0	eine	O	O	-	0	482	510	699	715 | ||||
| 0	ganze	O	O	-	0	525	566	702	718 | ||||
| 0	Stunde	O	O	-	0	577	633	698	715 | ||||
| 0	lang	O	O	-	0	648	681	698	715 | ||||
| 0	von	O	O	-	0	695	712	701	714 | ||||
| 0	der	O	O	-	0	714	756	698	714 | ||||
| 0	Rue	B-LOC	O	-	0	157	189	718	735 | ||||
| 0	du	I-LOC	O	-	0	204	222	719	735 | ||||
| 0	Faubourg	I-LOC	O	-	0	232	308	718	736 | ||||
| 0	St	I-LOC	O	-	0	324	351	718	735 | ||||
| 0	.	I-LOC	O	-	0	324	351	718	735 | ||||
| 0	Honoré	I-LOC	O	-	0	360	418	718	736 | ||||
| 0	aus	O	O	-	0	434	462	720	733 | ||||
| 0	im	O	O	-	0	476	496	718	734 | ||||
| 0	Vorhof	O	O	-	0	505	562	717	735 | ||||
| 0	des	O	O	-	0	577	602	718	733 | ||||
| 0	Elyſée	B-LOC	O	4075880-1	0	612	661	717	736 | ||||
| 0	ſtationiren	O	O	-	0	666	755	703	736 | ||||
| 0	geſehen	O	O	-	0	158	211	737	756 | ||||
| 0	hatten	O	O	-	0	222	273	737	754 | ||||
| 0	.	O	O	-	0	222	273	737	754 | ||||
| 0		O	O	-	0	292	321	737	753 | ||||
| 0	Der	O	O	-	0	292	321	737	753 | ||||
| 0	Erfolg	O	O	-	0	331	382	736	756 | ||||
| 0	dieſer	O	O	-	0	392	432	736	755 | ||||
| 0	Viſite	O	O	-	0	437	480	736	754 | ||||
| 0	war	O	O	-	0	490	520	740	753 | ||||
| 0	denn	O	O	-	0	530	565	736	752 | ||||
| 0	auch	O	O	-	0	574	606	736	754 | ||||
| 0	ſchon	O	O	-	0	616	655	735	755 | ||||
| 0	in	O	O	-	0	665	679	735	752 | ||||
| 0	derſelben	O	O	-	0	689	756	736	754 | ||||
| 0	Zeit	O	O	-	0	157	189	756	775 | ||||
| 0	zu	O	O	-	0	198	214	760	775 | ||||
| 0	ſpüren	O	O	-	0	224	277	755	774 | ||||
| 0	,	O	O	-	0	224	277	755	774 | ||||
| 0	da	O	O	-	0	287	305	755	772 | ||||
| 0	derjenige	O	O	-	0	314	392	755	774 | ||||
| 0	,	O	O	-	0	314	392	755	774 | ||||
| 0	der	O	O	-	0	396	419	756	771 | ||||
| 0	ſie	O	O	-	0	429	445	755	774 | ||||
| 0	gemacht	O	O	-	0	455	519	756	774 | ||||
| 0	,	O	O	-	0	455	519	756	774 | ||||
| 0	ſie	O	O	-	0	533	550	754	773 | ||||
| 0	ableugnen	O	O	-	0	565	641	756	774 | ||||
| 0	wollte	O	O	-	0	651	702	754	770 | ||||
| 0	.	O	O	-	0	651	702	754	770 | ||||
| 0		O	O	-	0	720	756	754	774 | ||||
| 0	Herr	O	O	-	0	720	756	754	774 | ||||
| 0	Lepère	B-PER	O	1012607569	0	156	212	774	793 | ||||
| 0	,	O	O	-	0	156	212	774	793 | ||||
| 0	der	O	O	-	0	227	250	774	790 | ||||
| 0	bereits	O	O	-	0	264	314	774	790 | ||||
| 0	ſeine	O	O	-	0	331	374	773	792 | ||||
| 0	Siebenſachen	O	O	-	0	382	480	773	792 | ||||
| 0	zuſammengepackt	O	O	-	0	494	623	773	793 | ||||
| 0	hatte	O	O	-	0	638	679	773	791 | ||||
| 0	,	O	O	-	0	638	679	773	791 | ||||
| 0	weil	O	O	-	0	696	727	773	789 | ||||
| 0	er	O	O	-	0	743	756	777	789 | ||||
| 0	glaubte	O	O	-	0	157	211	793	811 | ||||
| 0	ausziehen	O	O	-	0	221	295	793	811 | ||||
| 0	zu	O	O	-	0	305	322	797	811 | ||||
| 0	müſſen	O	O	-	0	332	383	793	811 | ||||
| 0	—	O	O	-	0	393	412	801	803 | ||||
| 0	Freycinet	B-PER	O	118703099	0	421	493	793	811 | ||||
| 0	ſelbſt	O	O	-	0	496	544	792	811 | ||||
| 0	hatte	O	O	-	0	554	590	792	810 | ||||
| 0	ihm	O	O	-	0	600	629	793	809 | ||||
| 0	das	O	O	-	0	639	666	792	808 | ||||
| 0	zu	O	O	-	0	675	692	796	811 | ||||
| 0	wieder	O	O	-	0	702	756	791	808 | ||||
| 0	—	O	O	-	0	702	756	791	808 | ||||
| 0	holten	O	O	-	0	156	202	810	830 | ||||
| 0	Malen	O	O	-	0	212	262	811	828 | ||||
| 0	in	O	O	-	0	272	287	811	828 | ||||
| 0	dürren	O	O	-	0	297	347	812	827 | ||||
| 0	Worten	O	O	-	0	357	415	812	827 | ||||
| 0	geſagt	O	O	-	0	425	475	811	830 | ||||
| 0	—	O	O	-	0	484	503	819	822 | ||||
| 0	Herr	O	O	-	0	512	548	811	830 | ||||
| 0	Lepre	B-PER	O	1012607569	0	556	607	811	829 | ||||
| 0	erhielt	O	O	-	0	616	664	811	830 | ||||
| 0	von	O	O	-	0	674	701	814	826 | ||||
| 0	Gam	B-PER	O	118716263	0	711	755	811	827 | ||||
| 0	—	I-PER	O	-	0	711	755	811	827 | ||||
| 0	betta	I-PER	O	-	0	156	192	829	846 | ||||
| 0	die	O	O	-	0	202	224	830	846 | ||||
| 0	Nachricht	O	O	-	0	234	308	830	848 | ||||
| 0	,	O	O	-	0	234	308	830	848 | ||||
| 0	daß	O	O	-	0	318	346	830	848 | ||||
| 0	er	O	O	-	0	356	370	835	846 | ||||
| 0	bleiben	O	O	-	0	380	432	830	846 | ||||
| 0	dürfe	O	O	-	0	445	488	830	848 | ||||
| 0	.	O	O	-	0	445	488	830	848 | ||||
| 0		O	O	-	0	508	592	830	848 | ||||
| 0	Gleichzeitig	O	O	-	0	508	592	830	848 | ||||
| 0	wurde	O	O	-	0	602	649	829	845 | ||||
| 0	Herrn	O	O	-	0	658	703	829	848 | ||||
| 0	Waddington	B-PER	O	117086630	0	714	756	829	845 | ||||
| 0	das	O	O	-	0	230	257	849	865 | ||||
| 0	Gegentheil	O	O	-	0	272	354	848	867 | ||||
| 0	bedeutet	O	O	-	0	370	437	849	867 | ||||
| 0	;	O	O	-	0	370	437	849	867 | ||||
| 0	den	O	O	-	0	451	476	849	864 | ||||
| 0	Botſchafterpoſten	O	O	-	0	486	617	848	867 | ||||
| 0	in	O	O	-	0	633	648	848	864 | ||||
| 0	London	B-LOC	O	4074335-4	0	658	716	848	864 | ||||
| 0	,	O	O	-	0	658	716	848	864 | ||||
| 0	der	O	O	-	0	720	756	848	866 | ||||
| 0	ihm	O	O	-	0	156	185	866	885 | ||||
| 0	als	O	O	-	0	196	219	868	884 | ||||
| 0	Entſchädigung	O	O	-	0	230	339	867	886 | ||||
| 0	angeboten	O	O	-	0	350	426	868	886 | ||||
| 0	wurde	O	O	-	0	436	486	868	884 | ||||
| 0	,	O	O	-	0	436	486	868	884 | ||||
| 0	ſchlug	O	O	-	0	496	539	867	886 | ||||
| 0	er	O	O	-	0	549	563	872	883 | ||||
| 0	aus	O	O	-	0	573	605	869	883 | ||||
| 0	.	O	O	-	0	573	605	869	883 | ||||
| 0		O	O	-	0	625	648	866	882 | ||||
| 0	Von	O	O	-	0	625	648	866	882 | ||||
| 0	allen	O	O	-	0	649	699	868	882 | ||||
| 0	dieſen	O	O	-	0	699	756	866	884 | ||||
| 0	Vorgängen	O	O	-	0	159	244	885	905 | ||||
| 0	erhielt	O	O	-	0	254	305	886	904 | ||||
| 0	Léon	B-PER	O	117619744	0	310	350	885	902 | ||||
| 0	Say	I-PER	O	-	0	360	394	886	905 | ||||
| 0	erſt	O	O	-	0	407	432	886	902 | ||||
| 0	in	O	O	-	0	445	460	886	902 | ||||
| 0	ſpäter	O	O	-	0	475	519	886	903 | ||||
| 0	Nachmittagsſtunde	O	O	-	0	528	671	885	905 | ||||
| 0	Kenntniß	O	O	-	0	682	756	885	903 | ||||
| 0	.	O	O	-	0	682	756	885	903 | ||||
| 0		O	O	-	0	161	198	904	921 | ||||
| 0	Sein	O	O	-	0	161	198	904	921 | ||||
| 0	Entſchluß	O	O	-	0	208	281	904	923 | ||||
| 0	war	O	O	-	0	297	328	908	920 | ||||
| 0	ſofort	O	O	-	0	343	391	905	923 | ||||
| 0	gefaßt	O	O	-	0	400	451	903	923 | ||||
| 0	.	O	O	-	0	400	451	903	923 | ||||
| 0		O	O	-	0	471	519	905	923 | ||||
| 0	Gegen	O	O	-	0	471	519	905	923 | ||||
| 0	6	O	O	-	0	535	544	907	920 | ||||
| 0	Uhr	O	O	-	0	560	589	905	922 | ||||
| 0	Abends	O	O	-	0	599	656	904	920 | ||||
| 0	fuhr	O	O	-	0	666	690	904	922 | ||||
| 0	er	O	O	-	0	692	723	909	920 | ||||
| 0	ins	O	O	-	0	733	756	904	919 | ||||
| 0	Elyſée	B-LOC	O	4075880-1	0	158	207	923	942 | ||||
| 0	und	O	O	-	0	220	248	924	939 | ||||
| 0	legte	O	O	-	0	264	299	924	940 | ||||
| 0	ſein	O	O	-	0	313	340	923	940 | ||||
| 0	Portefeuille	O	O	-	0	355	445	923	942 | ||||
| 0	in	O	O	-	0	461	475	923	939 | ||||
| 0	Grevys	B-PER	O	119064693	0	490	546	923	942 | ||||
| 0	Hände	O	O	-	0	557	606	923	942 | ||||
| 0	zurück	O	O	-	0	621	671	923	942 | ||||
| 0	.	O	O	-	0	621	671	923	942 | ||||
							
								
								
									
neath.html (44 lines changed)
							|  | @ -3,33 +3,35 @@ | |||
| <head> | ||||
|     <meta charset="UTF-8"> | ||||
|     <title>neath</title> | ||||
|     <base href="neath.html" target="_blank"> | ||||
|     <link rel="stylesheet" href="https://stackpath.bootstrapcdn.com/bootstrap/4.3.1/css/bootstrap.min.css" | ||||
|           integrity="sha384-ggOyR0iXCbMQv3Xipma34MD+dH/1fQ784/j6cY/iJTQUOhcWr7x9JvoRxT2MZw1T" crossorigin="anonymous"> | ||||
|     <script type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/PapaParse/5.0.1/papaparse.js"></script> | ||||
|     <script type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/PapaParse/5.1.0/papaparse.min.js"></script> | ||||
|     <script type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/keypress/2.1.5/keypress.min.js"></script> | ||||
|     <script type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.4.1/jquery.min.js"></script> | ||||
|     <script type="text/javascript" src="http://code.jquery.com/ui/1.12.1/jquery-ui.min.js"></script> | ||||
|     <script type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/jqueryui/1.12.1/jquery-ui.min.js"></script> | ||||
|     <style> | ||||
|         body{font-family:Verdana;font-size:16px} | ||||
|         table{table-layout:fixed;width:100%;text-align:center} | ||||
|         th{background-color:lightgray} | ||||
|         .editable:hover{background-color:yellow} | ||||
|         tr:hover{background-color:whitesmoke} | ||||
|         .editable:focus{background-color:#f0e442} | ||||
|         tr:focus-within{background-color:#dddddd} | ||||
| 
 | ||||
|         .accordion:hover .accordion-item:hover .accordion-item-content, | ||||
| 		.accordion .accordion-item--default .accordion-item-content{height:9em;} | ||||
| 		.accordion-item-content, .accordion:hover .accordion-item-content{height:0;overflow:hidden;transition:height.25s;} | ||||
| 		.accordion{padding:0;margin:0auto;width:100px;} | ||||
| 		.accordion-item:hover{background-color:yellow;} | ||||
| 		.accordion .accordion-item--default .accordion-item-content{height:10.5em} | ||||
| 		.accordion-item-content, .accordion:hover .accordion-item-content{height:0;overflow:hidden;transition:height.25s} | ||||
| 		.accordion{padding:0;margin:auto;width:100px} | ||||
| 		.accordion-item:hover{background-color:#f0e442} | ||||
| 
 | ||||
|         .type_select:hover{background-color:yellow;} | ||||
|         .type_select:hover{background-color:#f0e442} | ||||
| 
 | ||||
|         .ner_per{background-color:skyblue} | ||||
|         .ner_loc{background-color:goldenrod} | ||||
|         .ner_org{background-color:plum} | ||||
|         .ner_pub{background-color:lightgreen} | ||||
|         .ner_conf{background-color:olive} | ||||
|         .ner_art{background-color:lavender} | ||||
|         .ner_todo{background-color:turquoise} | ||||
|         .ner_per{background-color:#56b3e9} | ||||
|         .ner_loc{background-color:#e69d00} | ||||
|         .ner_org{background-color:#df6caa} | ||||
|         .ner_work{background-color:#009e74} | ||||
|         .ner_conf{background-color:#0072b2} | ||||
|         .ner_evt{background-color:#a60a2d} | ||||
|         .ner_todo{background-color:#d55e00} | ||||
| 
 | ||||
|         .fit-image{ | ||||
|             width: 100%; | ||||
|  | @ -53,8 +55,8 @@ | |||
|         <div class="col-9"> | ||||
|             <div class="row"> | ||||
|                 <div class="col text-center"> | ||||
|                     <h3><a href="https://github.com/qurator-spk/neath" target="_blank">neath</a>: named entity annotation tool in html</h3> | ||||
|                     <a href="https://github.com/qurator-spk/neath/blob/master/docs/User_Guide.md" target="_blank">User Guide</a> | <a href="https://github.com/qurator-spk/neath/blob/master/docs/Annotation_Guidelines.md" target="_blank">Annotation Guidelines</a> | <a href="https://github.com/qurator-spk/neath/issues" target="_blank">Issues</a><hr> | ||||
|                     <h3><a href="https://github.com/qurator-spk/neath" target="_blank" tabindex="-1">neath</a>: named entity annotation tool</h3> | ||||
|                     <a href="https://github.com/qurator-spk/neath/blob/master/README.md#2-user-guide" target="_blank"  tabindex="-1">User Guide</a> | <a href="https://github.com/qurator-spk/neath/blob/master/Annotation_Guidelines.pdf" target="_blank" tabindex="-1">Annotation Guidelines</a> | <a href="https://github.com/qurator-spk/neath/issues" target="_blank" tabindex="-1">Issues</a><hr> | ||||
|                 </div> | ||||
|             </div> | ||||
|         </div> | ||||
|  | @ -62,13 +64,13 @@ | |||
|         </div> | ||||
|     </div> | ||||
|     <div class="row mt-3"> | ||||
|         <div class="col-2" id="region-left"> | ||||
|         <div class="col-3" id="region-left"> | ||||
|             <a href="" id="preview-link"> | ||||
|                 <img id="preview" alt="facsimile_preview" class="img-responsive fit-image"/> | ||||
|             </a> | ||||
|         </div> | ||||
|         <div class="col-9 text-center" id="tableregion"> | ||||
|             Please upload a TSV file: | ||||
|         <div class="col-8 text-center" id="tableregion"> | ||||
|             Please upload a TSV<sup>(<a href="https://github.com/qurator-spk/neath/blob/master/User_Guide.md#22-data-format">i</a>)</sup> file: | ||||
|             <br><br> | ||||
|             <input type="file" id="tsv-file" name="files"/> | ||||
|         </div> | ||||
|  |  | |||