37edc0336f
🧹 dinglehopper: Remove obsolete XXX that has a GitHub issue
2020-06-18 13:27:59 +02:00
9f05e6ca4c
🧹 dinglehopper: Remove obsolete XXX about None ids
2020-06-18 13:27:59 +02:00
4469af62c8
🎨 dinglehopper: Unfuck substitutions a bit
2020-06-18 13:27:59 +02:00
079be203bd
🐛 dinglehopper: Fix tests to deal with new normalization logic
2020-06-18 13:27:59 +02:00
c010a7f05e
🧹 dinglehopper: Calculate segment ids once, on the first call
2020-06-18 13:27:59 +02:00
0cf7ff4721
🧹 dinglehopper: Remove obsolete XXX about the PAGE hierarchy
2020-06-18 13:27:59 +02:00
c432cb505a
🧹 dinglehopper: Clean up test_lines_similar()
2020-06-18 13:27:59 +02:00
0c33e84415
📓 dinglehopper: Document editops()
2020-06-18 13:27:59 +02:00
a61c935624
🧹 dinglehopper: Move Python 3.5 XXXs to a GitHub issue
See https://github.com/qurator-spk/dinglehopper/issues/20.
2020-06-18 13:27:59 +02:00
257e4986cc
🚧 dinglehopper: Use a Bootstrap tooltip for the segment id
2020-06-18 13:27:59 +02:00
a320d5fd8f
🚧 dinglehopper: Re-introduce "substitute_equivalences" as Normalization.NFC_SBB
2020-06-18 13:27:59 +02:00
2579e0220c
🚧 dinglehopper: Remove debug output
2020-06-18 13:27:59 +02:00
d4e39d3d26
🚧 dinglehopper: Display segment id in the corresponding column
2020-06-18 13:27:59 +02:00
48ad340428
🚧 dinglehopper: Display segment id when hovering over a character difference
2020-06-18 13:27:59 +02:00
1f6538b44c
🚧 dinglehopper: Extract text while retaining segment id info
2020-06-18 13:27:59 +02:00
275ff32524
🚧 dinglehopper: Extract text while retaining segment id info
2020-06-18 13:27:59 +02:00
4e182e0794
🚧 dinglehopper: Extract text while retaining segment id info
2020-06-18 13:27:59 +02:00
9f8bb1d8ea
🚧 dinglehopper: Extract text while retaining segment id info
2020-06-18 13:27:59 +02:00
668de758a0
✨ dinglehopper: Support disabling metrics in the OCR-D interface
2020-06-09 18:29:59 +02:00
f699697eb3
🐛 dinglehopper: Fix reading OCR-D workspace files when only URLs are provided
2020-06-09 17:13:22 +02:00
22765f02a2
🐛 dinglehopper: Fix tests by making metrics a keyword argument
2020-06-09 13:07:44 +02:00
5cbeb7b0dd
✨ dinglehopper: Support disabling the metrics using CLI option --no-metrics
2020-06-08 18:26:21 +02:00
745095e52c
✨ dinglehopper: Include number of characters and words in JSON report
2020-02-21 14:53:16 +01:00
48a31ce672
Revert "Merge branch 'master' of https://github.com/qurator-spk/sbb_textline_detector"
This reverts commit 2c89bf3b35ee290d7b830ef270df3a96aa48245e, reversing
changes made to 9f7e413148ca5dbac9b555d7b0d0a5fa3a0f5340.
2019-12-09 12:44:05 +01:00
b-vr103
1303a7d92f
Merge branch 'master' of https://github.com/qurator-spk/sbb_textline_detector
2019-12-09 11:57:16 +01:00
f32eb9eb69
🐛 dinglehopper: Escape text inserted into HTML (Fixes #8)
2019-12-06 15:59:09 +01:00
82e863fac2
📝 dinglehopper: Document seq_editops()
2019-12-04 13:22:38 +01:00
5ccdace1dd
🎨 dinglehopper: Move working_directory() context manager into tests/util
2019-12-02 15:18:08 +01:00
f98c527c93
🐛 dinglehopper: Fix working_directory() context manager
2019-12-02 15:14:16 +01:00
5273d10bac
🐛 dinglehopper: Generate a loadable JSON report even if CER=∞
2019-12-02 15:00:07 +01:00
ced6504ad0
🎨 dinglehopper: Expose clearing the Levenshtein cache as a function
2019-11-20 13:24:45 +01:00
5cf4eddaeb
⚡ dinglehopper: Clear Levenshtein cache between OCR-D files
2019-11-20 13:05:45 +01:00
58ff140bc0
⚡ dinglehopper: Improve performance by caching the Levenshtein matrix
Motivated by [a pull
request](https://github.com/qurator-spk/dinglehopper/pull/7) by
@JKamlah, implement a cache of the Levenshtein matrix calculation.
We calculated the Levenshtein matrices for characters and words twice:
Once for the error rates, once for the alignment.
2019-11-18 15:33:17 +01:00
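A minimal sketch of the caching idea described in this commit, not the actual dinglehopper code: the function names `levenshtein_matrix`, `distance` and `clear_cache` and the cache size are assumptions chosen for illustration. It memoizes the dynamic-programming matrix with `functools.lru_cache`, so that a second consumer (for example an alignment step after an error-rate step) reuses the matrix instead of recomputing it.

```python
from functools import lru_cache


@lru_cache(maxsize=10)  # cache size is an arbitrary choice for this sketch
def levenshtein_matrix(seq1, seq2):
    """Return the full Levenshtein DP matrix for two hashable sequences (tuples)."""
    m, n = len(seq1), len(seq2)
    D = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        D[i][0] = i  # delete all of seq1[:i]
    for j in range(n + 1):
        D[0][j] = j  # insert all of seq2[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if seq1[i - 1] == seq2[j - 1] else 1
            D[i][j] = min(
                D[i - 1][j] + 1,         # deletion
                D[i][j - 1] + 1,         # insertion
                D[i - 1][j - 1] + cost,  # match or substitution
            )
    return D


def distance(s1, s2):
    """Edit distance, read from the bottom-right cell of the cached matrix."""
    # Convert to tuples so the arguments are hashable for lru_cache.
    return levenshtein_matrix(tuple(s1), tuple(s2))[-1][-1]


def clear_cache():
    """Drop all cached matrices, e.g. between unrelated input files."""
    levenshtein_matrix.cache_clear()
```

Because the cached matrix is a mutable list that is shared between callers, consumers in a sketch like this should treat it as read-only.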
11a6341641
🧹 dinglehopper: Remove broken implementation of the unordered word error rate
2019-11-18 15:03:17 +01:00
f22228840e
🧹 dinglehopper: Use exclusively relative imports in tests
2019-11-18 14:31:43 +01:00
d61c076aad
🧹 dinglehopper: Remove debug print()s
2019-11-18 13:15:43 +01:00
12a48f3bfe
✅ dinglehopper: Test aligning lists of lines
2019-11-18 13:00:40 +01:00
680c2a2661
🐛 dinglehopper: Fix test_ocrd_cli for Python 3.5, again, and again
2019-10-28 15:05:08 +01:00
7cf1a540f4
🐛 dinglehopper: Fix test_ocrd_cli for Python 3.5, again
2019-10-28 14:58:24 +01:00
49e2065ad6
🐛 dinglehopper: Fix test_ocrd_cli for Python 3.5
2019-10-28 14:51:41 +01:00
86178271df
✅ dinglehopper: Fix repeated tests for the OCR-D interface
2019-10-28 11:47:42 +01:00
b6f50ef853
✅ dinglehopper: Add a test for the OCR-D interface
2019-10-25 19:00:39 +02:00
Konstantin Baierer
2ca44af31d
ocrd-tool: add category
2019-10-18 16:43:03 +02:00
c30553985f
dinglehopper: Substitute more characters
2019-10-01 13:18:12 +02:00
493541fddf
🐛 dinglehopper: Always work with NFC text
2019-10-01 12:35:44 +02:00
df93c80e5d
🐛 dinglehopper: Always work with NFC text
2019-10-01 11:28:14 +02:00
715b813bbc
dinglehopper: Add two more eMOP ligatures
2019-10-01 10:53:20 +02:00
8d055e7b6e
🐛 dinglehopper: Work on NFC'ed grapheme clusters when aligning text
2019-09-30 18:17:13 +02:00
534958be1d
🐛 dinglehopper: Fix sorting the reading order
Regions were sorted wrongly when there are more than 9 regions in an
OrderedGroup because the index was sorted alphabetically, not
numerically. Fix this by converting the index to integers.
2019-09-30 16:06:59 +02:00
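The sorting problem described in this commit can be illustrated with a small sketch. The index values below are hypothetical stand-ins for the index strings read from an OrderedGroup; this is not the project's parsing code.

```python
# Hypothetical index values, held as strings the way they would come out of XML.
indexes = [str(i) for i in range(12)]

# Sorting the strings compares them character by character,
# so "10" and "11" end up before "2".
alphabetical = sorted(indexes)

# Converting each index to an integer for comparison restores numeric order.
numerical = sorted(indexes, key=int)

print(alphabetical)
print(numerical)
```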
10f010eaa8
🐛 dinglehopper: Do not throw error if a region ID is not found
The ReadingOrder might contain regions of types other than text regions,
so not finding a TextRegion with the referenced ID is not an error.
Downgrade to a warning for now.
2019-09-26 15:19:30 +02:00