Introduces version pinning via pip-tools for reproducible builds.
Introduces license checking via pip-licenses and CircleCI.
Licenses are only checked for new builds and when requirements.txt changes.
The list of allowed licenses is kept in a separate file to be able to
distinguish changes in CI-Configuration/Tools and license list changes.
Dinglehopper did not consider `OrderedGroupIndex` in the `ReadingOrder`
element when extracting text regions. As a consequence a `TableRegion`
was not considered for text extraction.
Sharing even parts of the .idea folder in worldwide setting is bound to
generate more problems than solutions. Therefore it should be removed
and consequently ignore in .gitignore.
Also adds some Python specific stuff to the .gitignore file.
When extracting text from TextEquiv nodes we may encounter nodes without
index or nodes that should get sorted via the conf attribute.
Therefore we added a more complex algorithm to extract a TextEquiv and
inform the user via log messages if we encounter structures that we can
handle but may produce unexpected results.
Python's `sorted` method will fail with a TypeError when called with
`None` and Integers:
```python
>>> sorted([None, 1])
TypeError: '<' not supported between instances of 'int' and 'NoneType'
```
Therefore we are using `float('inf')` instead of `None` in case of
missing textline indices.
Add a proposal for a .editorconfig file (see https://editorconfig.org/).
This is natively supported by a lot of editors, others are supported via
plugins.
This will close#19.