dinglehopper

mirror of https://github.com/qurator-spk/dinglehopper.git synced 2025-10-31 09:24:15 +01:00

Author	SHA1	Message	Date
Benjamin Rosemann	750ad00d1b	Add tooltips to fca report	2021-02-16 11:28:23 +01:00
Benjamin Rosemann	53064bf833	Include fca as parameter and add some tests	2021-02-16 11:28:23 +01:00
Benjamin Rosemann	9b76539936	Fix numpy version conflict with ocrd_utils	2021-02-16 11:28:23 +01:00
Benjamin Rosemann	26fe98dde7	Readd pytest.ini	2021-02-16 11:28:23 +01:00
Benjamin Rosemann	4a87adc2c7	Implement version specific data structures As ocr-d continues the support for Python 3.5 until the end of this year version specific data structures have been implemented. When the support for Python 3.5 is dropped the extra file can easily be removed.	2021-02-16 11:28:23 +01:00
Benjamin Rosemann	2a215a1062	Reformat using black	2021-02-16 11:28:23 +01:00
Benjamin Rosemann	5277593bdb	Fix some special cases	2021-02-16 11:28:23 +01:00
Benjamin Rosemann	d7a74fa58b	First draft of flexible character accuracy	2021-02-16 11:28:23 +01:00
Gerber, Mike	bd324331e6	🚧 dinglehopper: Try out Drone CI	2021-02-11 14:26:29 +01:00
Gerber, Mike	a59ecb795c	🚧 dinglehopper: Try out Drone CI	2021-02-11 14:15:08 +01:00
Gerber, Mike	14230e073a	🚧 dinglehopper: Try out Drone CI	2021-02-11 14:08:25 +01:00
Gerber, Mike	985666a71c	🚧 dinglehopper: Try out Drone CI	2021-02-10 20:35:22 +01:00
Gerber, Mike	4a73053cfc	🚧 Replace Travis with CircleCI	2021-02-10 18:22:52 +01:00
Gerber, Mike	e3d4493c82	🚧 Replace Travis with CircleCI	2021-02-10 17:58:58 +01:00
Gerber, Mike	27f4c3bdf8	🚧 Replace Travis with CircleCI	2021-02-10 17:57:08 +01:00
Gerber, Mike	8533e6d421	🚧 Replace Travis with CircleCI	2021-02-10 17:55:09 +01:00
Gerber, Mike	e8da8b63f8	🚧 Replace Travis with CircleCI	2021-02-10 17:53:50 +01:00
Gerber, Mike	3b7a1a5631	🚧 Replace Travis with CircleCI	2021-02-10 17:50:34 +01:00
Mike Gerber	691ce371ca	Merge pull request #50 from b2m/fix-table-extraction Fix the extraction of text from Page with TableRegion	2021-02-01 17:51:33 +01:00
Benjamin Rosemann	a68fc269d9	Fix the extraction of text from Page with TableRegion Dinglehopper did not consider `OrderedGroupIndex` in the `ReadingOrder` element when extracting text regions. As a consequence a `TableRegion` was not considered for text extraction.	2020-11-27 11:18:11 +01:00
Gerber, Mike	8cd8314c8a	🐛 dinglehopper: Bump up ocrd req for zip_input_files See also GH-49.	2020-11-19 18:59:47 +01:00
Mike Gerber	62670dd0c7	Merge pull request #49 from kba/zip_input_files ocrd cli: use core-provided zip_input_files method	2020-11-19 18:54:21 +01:00
Konstantin Baierer	74e0ac18ed	ocrd cli: use core-provided zip_input_files method	2020-11-19 16:00:28 +01:00
Gerber, Mike	389e253c11	🐛 dinglehopper: Fix alto_extract_lines()'s type annotation	2020-11-12 19:32:38 +01:00
Gerber, Mike	fe3923a8af	🐛 dinglehopper: Fix alto_extract()'s type annotation	2020-11-12 19:19:05 +01:00
Gerber, Mike	132f91d500	✔️ dinglehopper: Add missing integration test markers	2020-11-12 19:10:23 +01:00
Gerber, Mike	c48d7646df	📝 dinglehopper: README-DEV: Massage markdown a bit	2020-11-12 19:05:14 +01:00
Mike Gerber	fed021090d	Merge pull request #46 from b2m/tool-changes Tool changes	2020-11-12 18:59:25 +01:00
Benjamin Rosemann	cb1ac9d260	Add black to developer requirements.	2020-11-11 11:36:17 +01:00
Benjamin Rosemann	03ad413f4a	Added some helpful tools and configurations	2020-11-11 11:36:17 +01:00
Benjamin Rosemann	5cbd4f3d95	Preparation for black code formatter	2020-11-11 11:36:17 +01:00
Benjamin Rosemann	ce752e1912	Remove .idea folder and modify .gitignore Sharing even parts of the .idea folder in worldwide setting is bound to generate more problems than solutions. Therefore it should be removed and consequently ignore in .gitignore. Also adds some Python specific stuff to the .gitignore file.	2020-11-11 11:36:17 +01:00
Benjamin Rosemann	5270737c1f	Skip test on windows because it is unix specific.	2020-11-11 11:36:17 +01:00
Gerber, Mike	32a4b95a99	🐛 dinglehopper: Normalize in plain_extract()	2020-11-10 18:51:14 +01:00
Gerber, Mike	14421c8e53	🎨 dinglehopper: Reformat using black	2020-11-10 12:29:55 +01:00
Gerber, Mike	31c63f9e4c	🎨 dinglehopper: s/LOG/log	2020-11-09 16:55:43 +01:00
Mike Gerber	0804b029c4	Merge pull request #43 from bertsky/patch-1 1 more update for core's getLogger context	2020-11-09 16:51:00 +01:00
Robert Sachunsky	a60c14351e	1 more update for core's getLogger context	2020-11-03 17:46:59 +01:00
Mike Gerber	a51f0b3dcd	Merge pull request #42 from b2m/test-python-cache-for-travis Add travis pip caching	2020-10-30 12:35:20 +01:00
Benjamin Rosemann	b10af9f138	Test travis pip caching	2020-10-29 16:41:19 +01:00
Mike Gerber	089f6d299e	Merge pull request #37 from b2m/fix-sort-with-none Sort textlines with missing indices	2020-10-29 15:05:46 +01:00
Mike Gerber	5138a1de21	Merge pull request #39 from b2m/test-python-3.9 Add Python 3.9 to .travis.yml	2020-10-29 13:42:24 +01:00
Benjamin Rosemann	c02569b41e	Fix f-strings for Python 3.5	2020-10-29 12:33:54 +01:00
Benjamin Rosemann	7b27b2834e	More complex sorting for text extraction When extracting text from TextEquiv nodes we may encounter nodes without index or nodes that should get sorted via the conf attribute. Therefore we added a more complex algorithm to extract a TextEquiv and inform the user via log messages if we encounter structures that we can handle but may produce unexpected results.	2020-10-29 10:03:40 +01:00
Benjamin Rosemann	6ff831dfd2	Sort textlines with missing indices Python's `sorted` method will fail with a TypeError when called with `None` and integers, e.g. `sorted([None, 1])` gives `TypeError: '<' not supported between instances of 'int' and 'NoneType'`. Therefore we are using `float('inf')` instead of `None` in case of missing textline indices.	2020-10-29 10:03:40 +01:00
Benjamin Rosemann	e77f19fefc	Add Python 3.9 to .travis.yml	2020-10-29 10:02:51 +01:00
Mike Gerber	082fc9e09a	Merge pull request #38 from b2m/add-editorconfig Add .editorconfig	2020-10-28 15:16:04 +01:00
Benjamin Rosemann	20661487d6	Add .editorconfig Add a proposal for a .editorconfig file (see https://editorconfig.org/). This is natively supported by a lot of editors, others are supported via plugins. This will close #19.	2020-10-28 11:31:18 +01:00
Gerber, Mike	6e47acda1c	📝 dinglehopper: Move screenshot higher	2020-10-21 19:31:53 +02:00
Gerber, Mike	5cbe148741	🐛 dinglehopper: Skip pages if there is no GT nor OCR (Fixes GH-34)	2020-10-21 19:29:45 +02:00


244 commits
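
Commit 6ff831dfd2 in the log above describes substituting `float('inf')` for `None` so that text lines with missing indices can still be sorted. The following is a minimal sketch of that idea only. The `sort_key` helper and the `(index, text)` pairs are hypothetical illustrations and are not taken from dinglehopper's actual code.

```python
def sort_key(index):
    """Map a missing index (None) to +inf so it sorts after all real indices."""
    return float("inf") if index is None else index


# Hypothetical (index, text) pairs; not dinglehopper's real data model.
lines = [(2, "second"), (None, "no index"), (1, "first")]

# Sorting on the raw indices would raise a TypeError because None and int
# cannot be compared; the key function avoids that comparison.
sorted_lines = sorted(lines, key=lambda item: sort_key(item[0]))
texts = [text for _, text in sorted_lines]
print(texts)
```

With this key, entries without an index end up after all indexed entries, and because Python's sort is stable they keep their original relative order among themselves.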