19 Commits (67e7f9e0f10493955f567f13148fb3b951a57619)

Author SHA1 Message Date
cneud 98e48dec38 typo 5 years ago
Clemens Neudecker 409d7db2f2 Transform OCR coordinates for web presentation images (fixes #31) thx @kba! (scaling factor will require testing with more images though) 5 years ago
Kai Labusch 4cb0a53434 turn --noproxy option into flag 5 years ago
Kai Labusch 8d79c67478 remove breakpoint 5 years ago
Kai Labusch ee07c0cf7c remove superfluous parameter 5 years ago
Kai Labusch 137fff5655 add word/sentence tokenization and NER pre-processing 5 years ago
cneud 1ce93382fe update setup.py 5 years ago
Kai Labusch 117b649120 fix README 5 years ago
Kai Labusch bcfb8220d3 notify user on page reload 5 years ago
Kai Labusch b82b8175a9 fix readme 5 years ago
Kai Labusch 22e97da8de remove url mapping file 5 years ago
Kai Labusch daa9a2676e fix wrong computation of boundaries 5 years ago
Kai Labusch 692e990fba improve html layout; add reasonable default for --image-url option 5 years ago
Kai Labusch d6311edd0c improve page2tsv tool 5 years ago
Clemens Neudecker f2e9ed535d Update README.md 5 years ago
cneud d4199d0ddd add Python script for converting PAGE-XML to TSV 5 years ago
Kai Labusch 01c080387e add tools README 5 years ago
Kai Labusch 450886cda6 add image preview 5 years ago
Kai Labusch 6afb0a6375 add annotation tools and url mapping integration 5 years ago