Commit graph

106 commits

johnlockejrr
f298643fcf
Fix ReduceLROnPlateau wrong logic
# Training Script Improvements

## Learning Rate Management Fixes

### 1. ReduceLROnPlateau Implementation
- Fixed the learning rate reduction mechanism by replacing the manual epoch loop with a single `model.fit()` call (see the sketch after this list)
- This ensures proper tracking of validation metrics across epochs
- Configured with:
  ```python
  from tensorflow.keras.callbacks import ReduceLROnPlateau

  reduce_lr = ReduceLROnPlateau(
      monitor='val_loss',
      factor=0.2,        # More aggressive reduction
      patience=3,        # Quick response to plateaus
      min_lr=1e-6,       # Minimum learning rate
      min_delta=1e-5,    # Minimum change to be considered improvement
      verbose=1
  )
  ```
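
A minimal sketch of how these callbacks feed into the single `model.fit()` call; `train_gen`, `val_gen`, and `n_epochs` are placeholder names, not the actual identifiers in `train.py`:

```python
history = model.fit(
    train_gen,                    # training data generator
    validation_data=val_gen,      # supplies the 'val_loss' the callbacks monitor
    epochs=n_epochs,
    callbacks=[reduce_lr, early_stopping],
)
```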

### 2. Warmup Implementation
- Added learning rate warmup using TensorFlow's native scheduling
- Gradually increases learning rate from 1e-6 to target (2e-5) over 5 epochs
- Helps stabilize initial training phase
- Implemented using `PolynomialDecay` schedule:
  ```python
  import tensorflow as tf

  lr_schedule = tf.keras.optimizers.schedules.PolynomialDecay(
      initial_learning_rate=warmup_start_lr,
      decay_steps=warmup_epochs * steps_per_epoch,
      end_learning_rate=learning_rate,
      power=1.0  # Linear decay
  )
  ```
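
The schedule object is then passed to the optimizer in place of a fixed learning rate (the full conditional wiring appears in the next commit below):

```python
from tensorflow.keras.optimizers import Adam

optimizer = Adam(learning_rate=lr_schedule)
```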

### 3. Early Stopping
- Added early stopping to prevent overfitting
- Configured with:
  ```python
  from tensorflow.keras.callbacks import EarlyStopping

  early_stopping = EarlyStopping(
      monitor='val_loss',
      patience=6,
      restore_best_weights=True,
      verbose=1
  )
  ```

## Model Saving Improvements

### 1. Epoch-based Model Saving
- Implemented custom `ModelCheckpointWithConfig` to save both model and config (a sketch follows this list)
- Saves after each epoch with corresponding config.json
- Maintains compatibility with original script's saving behavior
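
A minimal sketch of what such a callback can look like, assuming one SavedModel directory per epoch; the constructor arguments and directory naming are illustrative, not the exact implementation in `train.py`:

```python
import json
import os

import tensorflow as tf

class ModelCheckpointWithConfig(tf.keras.callbacks.Callback):
    def __init__(self, save_dir, config):
        super().__init__()
        self.save_dir = save_dir  # root directory for per-epoch checkpoints
        self.config = config      # training config dict, dumped next to each model

    def on_epoch_end(self, epoch, logs=None):
        epoch_dir = os.path.join(self.save_dir, f"model_{epoch + 1}")
        self.model.save(epoch_dir)  # Keras sets self.model during fit()
        with open(os.path.join(epoch_dir, "config.json"), "w") as f:
            json.dump(self.config, f, indent=4)
```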

### 2. Best Model Saving
- Saves the best model once training ends (a short sketch follows this list)
- If early stopping triggers, the best weights from training are saved
- If training runs to completion, the final model is saved
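
A sketch of the end-of-training save this implies, assuming `restore_best_weights=True` as configured above; `output_dir` is a placeholder:

```python
import os

# If early stopping fired, restore_best_weights=True has already copied the
# best epoch's weights back into the live model; otherwise the model holds
# the final weights. Either way one save captures the right model.
model.save(os.path.join(output_dir, "model_best"))
```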

## Configuration
All parameters are configurable through the JSON config file:
```json
{
    "reduce_lr_enabled": true,
    "reduce_lr_monitor": "val_loss",
    "reduce_lr_factor": 0.2,
    "reduce_lr_patience": 3,
    "reduce_lr_min_lr": 1e-6,
    "reduce_lr_min_delta": 1e-5,
    "early_stopping_enabled": true,
    "early_stopping_monitor": "val_loss",
    "early_stopping_patience": 6,
    "early_stopping_restore_best_weights": true,
    "warmup_enabled": true,
    "warmup_epochs": 5,
    "warmup_start_lr": 1e-6
}
```
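
A sketch of how these keys can be read with safe defaults; the `config` variable and the default values shown are assumptions, not necessarily what `train.py` does:

```python
import json

with open("runs/train_no_patches_448x448.json") as f:
    config = json.load(f)

# .get() keeps older config files that lack these keys working
reduce_lr_enabled = config.get("reduce_lr_enabled", False)
early_stopping_enabled = config.get("early_stopping_enabled", False)
warmup_enabled = config.get("warmup_enabled", False)
warmup_epochs = config.get("warmup_epochs", 5)
warmup_start_lr = config.get("warmup_start_lr", 1e-6)
```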

## Benefits
1. More stable training with proper learning rate management
2. Better handling of training plateaus
3. Automatic saving of best model
4. Maintained compatibility with existing config saving
5. Improved training monitoring and control
2025-05-17 23:24:40 +03:00
johnlockejrr
7661080899
LR Warmup and Optimization Implementation
# Learning Rate Warmup and Optimization Implementation

## Overview
Added learning rate warmup functionality to improve training stability, especially when using pretrained weights. The implementation uses TensorFlow's native learning rate scheduling for better performance.

## Changes Made

### 1. Configuration Updates (`runs/train_no_patches_448x448.json`)
Added new configuration parameters for warmup:
```json
{
    "warmup_enabled": true,
    "warmup_epochs": 5,
    "warmup_start_lr": 1e-6
}
```

### 2. Training Script Updates (`train.py`)

#### A. Optimizer and Learning Rate Schedule
- Replaced fixed learning rate with dynamic scheduling
- Implemented warmup using `tf.keras.optimizers.schedules.PolynomialDecay`
- Maintained compatibility with existing ReduceLROnPlateau and EarlyStopping

```python
import tensorflow as tf
from tensorflow.keras.optimizers import Adam

if warmup_enabled:
    lr_schedule = tf.keras.optimizers.schedules.PolynomialDecay(
        initial_learning_rate=warmup_start_lr,
        decay_steps=warmup_epochs * steps_per_epoch,
        end_learning_rate=learning_rate,
        power=1.0  # Linear decay
    )
    optimizer = Adam(learning_rate=lr_schedule)
else:
    optimizer = Adam(learning_rate=learning_rate)
```
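
Here `steps_per_epoch` is the number of optimizer steps per epoch; one common way to derive it (an assumption about the surrounding script, with `n_train_samples` and `batch_size` as placeholders):

```python
import math

steps_per_epoch = math.ceil(n_train_samples / batch_size)
```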

#### B. Learning Rate Behavior
- Initial learning rate: 1e-6 (configurable via `warmup_start_lr`)
- Target learning rate: 5e-5 (configurable via `learning_rate`)
- Linear increase over 5 epochs (configurable via `warmup_epochs`)
- After warmup, learning rate remains at target value until ReduceLROnPlateau triggers
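
To make the ramp concrete, a small sketch that evaluates the schedule at epoch boundaries; the `steps_per_epoch` value is illustrative:

```python
import tensorflow as tf

steps_per_epoch = 100  # illustrative
lr_schedule = tf.keras.optimizers.schedules.PolynomialDecay(
    initial_learning_rate=1e-6,       # warmup_start_lr
    decay_steps=5 * steps_per_epoch,  # warmup_epochs * steps_per_epoch
    end_learning_rate=5e-5,           # learning_rate
    power=1.0,
)
for epoch in (0, 1, 3, 5, 10):
    step = epoch * steps_per_epoch
    print(f"epoch {epoch:2d}: lr = {float(lr_schedule(step)):.2e}")
# lr rises linearly from 1e-6 to 5e-5 over epochs 0-5, then holds at 5e-5
```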

## Benefits
1. Improved training stability during initial epochs
2. Better handling of pretrained weights
3. Efficient implementation using TensorFlow's native scheduling
4. Configurable through JSON configuration file
5. Maintains compatibility with existing callbacks (ReduceLROnPlateau, EarlyStopping)

## Usage
To enable warmup:
1. Set `warmup_enabled: true` in the configuration file
2. Adjust `warmup_epochs` and `warmup_start_lr` as needed
3. The warmup will automatically integrate with existing learning rate reduction and early stopping

To disable warmup:
- Set `warmup_enabled: false` or remove the warmup parameters from the configuration file
2025-05-17 16:17:38 +03:00
johnlockejrr
1bf801985b
Update gt_gen_utils.py
Safely keep the full basename without extension
2025-05-14 03:34:51 -07:00
johnlockejrr
102b04c84d
Update utils.py
Changed unsafe basename extraction:
`file_name = i.split('.')[0]` to `file_name = os.path.splitext(i)[0]`
and
`filename = n[i].split('.')[0]` to `filename = os.path.splitext(n[i])[0]`
because with `split('.')` a filename like `Vat.sam.2_206.jpg` yields `Vat` instead of `Vat.sam.2_206`
2025-05-11 06:09:17 -07:00
johnlockejrr
be57f137d7
Update utils.py 2025-05-11 05:31:34 -07:00
johnlockejrr
451188c3b9
Changed deprecated lr to learning_rate and model.fit_generator to model.fit 2024-10-19 13:25:50 -07:00
johnlockejrr
df4a47ae6f
Update inference.py to check if save_layout was passed as argument otherwise can give an cv2 error 2024-10-19 13:21:29 -07:00
vahidrezanezhad
cca4d17823 new augmentations for patchwise training 2024-08-30 15:30:18 +02:00
vahidrezanezhad
5f456cf508 fixing artificial class bug 2024-08-28 17:34:06 +02:00
vahidrezanezhad
c502e67c14 adding foreground rgb to augmentation 2024-08-28 02:09:27 +02:00
vahidrezanezhad
4f0e3efa2b early dilation for textline artificial class 2024-08-28 00:04:19 +02:00
vahidrezanezhad
9904846776 using prepared binarized images in the case of augmentation 2024-08-22 21:58:09 +02:00
vahidrezanezhad
f31219b1c9 scaling, channels shuffling, rgb background and red content added to no patch augmentation 2024-08-21 19:33:23 +02:00
vahidrezanezhad
95bbdf8040 updating augmentations 2024-08-21 16:17:59 +02:00
vahidrezanezhad
7be326d689 augmentation function for red textlines, rgb background and scaling for no patch case 2024-08-21 00:48:30 +02:00
vahidrezanezhad
85dd59f23e update 2024-08-09 13:20:09 +02:00
vahidrezanezhad
f4bad09083 save only layout output, different from the overlaid layout on the image 2024-08-09 12:46:18 +02:00
Clemens Neudecker
b6bdf942fd
add documentation from wiki as markdown file to the codebase 2024-08-08 16:35:06 +02:00
vahidrezanezhad
59e5892f25 erosion rate changed 2024-08-01 14:30:51 +02:00
vahidrezanezhad
5fbe941f53 inference updated 2024-07-24 18:00:39 +02:00
vahidrezanezhad
30894ddc75 erosion and dilation parameters are changed & separators are written in label images after artificial label 2024-07-24 16:52:05 +02:00
b-vr103
c340fbb721 increasing margin in the case of pixelwise inference 2024-07-23 11:29:05 +02:00
b-vr103
f2692cf8dd brightness augmentation modified 2024-07-17 18:20:24 +02:00
vahidrezanezhad
9521768774 adding degrading and brightness augmentation to no patches case training 2024-07-17 17:14:20 +02:00
vahidrezanezhad
55f3cb9a84 printspace_as_class_in_layout is integrated. Printspace can be defined as a class for layout segmentation 2024-07-16 18:29:27 +02:00
vahidrezanezhad
647a3f8cc4 resolving typo 2024-07-09 03:04:29 +02:00
vahidrezanezhad
c0faecec2c update inference 2024-06-21 23:42:25 +02:00
vahidrezanezhad
033cf6734b update reading order machine based 2024-06-21 13:06:26 +02:00
vahidrezanezhad
9358657a0d update config 2024-06-12 17:40:40 +02:00
vahidrezanezhad
743f2e97d6 Transformer+CNN structure is added to vision transformer type 2024-06-12 17:39:57 +02:00
vahidrezanezhad
f1fd74c7eb transformer patch size is dynamic now. 2024-06-12 13:26:27 +02:00
vahidrezanezhad
2aa216e388 binarization as a separate task of segmentation 2024-06-11 17:48:30 +02:00
vahidrezanezhad
41a0e15e79 updating train.py nontransformer backend 2024-06-10 22:15:30 +02:00
vahidrezanezhad
815e5a1d35 updating train.py 2024-06-07 16:24:31 +02:00
vahidrezanezhad
dc356a5f42 just defined graphic region types can be extracted as label 2024-06-06 18:55:22 +02:00
vahidrezanezhad
b1d971a200 just defined textregion types can be extracted as label 2024-06-06 18:47:30 +02:00
vahidrezanezhad
1c8873ffa3 just defined textregion types can be extracted as label 2024-06-06 18:45:47 +02:00
vahidrezanezhad
e25a925169
Update README.md 2024-06-06 14:46:06 +02:00
vahidrezanezhad
b9cbc0edb7 replacement in a list done correctly 2024-06-06 14:38:29 +02:00
vahidrezanezhad
821290c464 scaling and cropping of labels and org images 2024-05-30 16:59:50 +02:00
vahidrezanezhad
4640d9f2dc modifying xml parsing 2024-05-30 12:56:56 +02:00
vahidrezanezhad
785033536a min_area size of regions considered for reading order detection passed as an argument for inference 2024-05-29 13:07:06 +02:00
vahidrezanezhad
f6abefb0a8 reading order detection on xml with layout + result will be written in an output directory with the same file name 2024-05-29 11:18:35 +02:00
vahidrezanezhad
2e7c69f2ac inference for reading order 2024-05-28 16:48:51 +02:00
vahidrezanezhad
356da4cc53 min area size of text region passed as an argument for machine-based reading order 2024-05-28 10:14:16 +02:00
vahidrezanezhad
29ddd4d909 pass degrading scales for image enhancement as a json file 2024-05-28 10:01:17 +02:00
vahidrezanezhad
5aa6ee0010 adding rest_as_paragraph and rest_as_graphic to elements 2024-05-27 17:23:49 +02:00
vahidrezanezhad
4e4490d740 machine based reading order training is integrated 2024-05-24 16:39:48 +02:00
vahidrezanezhad
bf1468391a machine based reading order training dataset generator is added 2024-05-24 14:42:58 +02:00
vahidrezanezhad
f5746011f6 use case printspace is added 2024-05-23 17:36:23 +02:00