From 0a728c2ce148dd81ce3563582da1e413aa759420 Mon Sep 17 00:00:00 2001
From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com>
Date: Tue, 28 Jan 2020 22:20:04 +0000
Subject: [PATCH 1/6] build(deps): bump tensorflow-gpu from 1.13.1 to 1.15.2

Bumps [tensorflow-gpu](https://github.com/tensorflow/tensorflow) from 1.13.1 to 1.15.2.
- [Release notes](https://github.com/tensorflow/tensorflow/releases)
- [Changelog](https://github.com/tensorflow/tensorflow/blob/master/RELEASE.md)
- [Commits](https://github.com/tensorflow/tensorflow/compare/v1.13.1...v1.15.2)

Signed-off-by: dependabot[bot]
---
 requirements.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/requirements.txt b/requirements.txt
index 21daa14..141da39 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -1,2 +1,2 @@
 calamari-ocr==0.3.5
-tensorflow-gpu==1.13.1
+tensorflow-gpu==1.15.2

From eedcb74ab8a872954e046dcf67a7bccf4a0709e2 Mon Sep 17 00:00:00 2001
From: Mike Gerber
Date: Wed, 12 Feb 2020 15:55:36 +0100
Subject: [PATCH 2/6] =?UTF-8?q?=F0=9F=93=9D=20Fix=20model=20download=20lin?=
 =?UTF-8?q?k?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Fixes #5.
---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 37becd0..0648fc3 100644
--- a/README.md
+++ b/README.md
@@ -12,4 +12,4 @@ is not available, that is if you're not a member of the Qurator team at SBB.
 Trained models
 --------------
 For a finished model have a look here:
-https://file.spk-berlin.de:8443/calamari-models/
+https://qurator-data.de/calamari-models/

From 9cd5981ae2348e88ca3d227e13d9fda1a2f35364 Mon Sep 17 00:00:00 2001
From: Mike Gerber
Date: Wed, 12 Feb 2020 15:59:35 +0100
Subject: [PATCH 3/6] =?UTF-8?q?=E2=AC=86=EF=B8=8F=20Make=20tensorflow-gpu?=
 =?UTF-8?q?=20dependency=20less=20tight?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 requirements.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/requirements.txt b/requirements.txt
index 141da39..9bdca1e 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -1,2 +1,2 @@
 calamari-ocr==0.3.5
-tensorflow-gpu==1.15.2
+tensorflow-gpu==1.15.*

From f5a3a4fb2eb7d7484eaca8b840d2ad5b9fc84951 Mon Sep 17 00:00:00 2001
From: Mike Gerber
Date: Thu, 13 Feb 2020 11:04:58 +0100
Subject: [PATCH 4/6] =?UTF-8?q?=F0=9F=93=9D=20README:=20Update=20to=20refl?=
 =?UTF-8?q?ect=20that=20this=20is=20mainly=20for=20documentation=20of=20th?=
 =?UTF-8?q?e=20trained=20model?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 README.md | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/README.md b/README.md
index 0648fc3..a644854 100644
--- a/README.md
+++ b/README.md
@@ -2,14 +2,21 @@ Train a GT4HistOCR Calamari model
 =================================
 
 `train.sh` trains a Calamari model based on GT4HistOCR. Or rather 5 using
-cross-validation to use for confidence voting.
-
-Requires Calamari 0.3.5.
-
-`train.sh` is able to download GT4HistOCR from the web if the `data` submodule
-is not available, that is if you're not a member of the Qurator team at SBB.
+cross-validation to use for confidence voting. This repository mainly
+serves as documentation of the providence of the model published at
+https://qurator-data.de/calamari-models/, not as the definitive guide to
+training such a model.
 
 Trained models
 --------------
 For a finished model have a look here:
 https://qurator-data.de/calamari-models/
+
+Training your own model
+-----------------------
+If you really want to, you can use this script to train your own. It takes
+about 1 week on a Nvidia RTX 2080 GPU. Please use [requirements.txt](requirements.txt)
+in that case to setup a virtualenv.
+
+`train.sh` is able to download GT4HistOCR from the web if the `data` submodule
+is not available, that is if you're not a member of the Qurator team at SBB.

From e2a3f7e10c26a2cd502abc180f96e120ab11a530 Mon Sep 17 00:00:00 2001
From: Mike Gerber
Date: Thu, 13 Feb 2020 11:18:57 +0100
Subject: [PATCH 5/6] =?UTF-8?q?=F0=9F=93=9D=20It's=20provenance=20not=20pr?=
 =?UTF-8?q?ovidence=20=F0=9F=98=80?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index a644854..eed4431 100644
--- a/README.md
+++ b/README.md
@@ -3,7 +3,7 @@ Train a GT4HistOCR Calamari model
 
 `train.sh` trains a Calamari model based on GT4HistOCR. Or rather 5 using
 cross-validation to use for confidence voting. This repository mainly
-serves as documentation of the providence of the model published at
+serves as documentation of the provenance of the model published at
 https://qurator-data.de/calamari-models/, not as the definitive guide to
 training such a model.
 
From a8848c939b4abfca1582fb4da676a91361f88f00 Mon Sep 17 00:00:00 2001
From: Mike Gerber
Date: Thu, 13 Feb 2020 13:03:38 +0100
Subject: [PATCH 6/6] =?UTF-8?q?=F0=9F=93=9D=20README=20should=20still=20me?=
 =?UTF-8?q?ntion=20Calamari=20version?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index eed4431..6510d90 100644
--- a/README.md
+++ b/README.md
@@ -1,7 +1,7 @@
 Train a GT4HistOCR Calamari model
 =================================
 
-`train.sh` trains a Calamari model based on GT4HistOCR. Or rather 5 using
+`train.sh` trains a Calamari 0.3.5 model based on GT4HistOCR. Or rather 5 using
 cross-validation to use for confidence voting. This repository mainly
 serves as documentation of the provenance of the model published at
 https://qurator-data.de/calamari-models/, not as the definitive guide to