qurator-spk/sbb_pixelwise_segmentation

mirror of https://github.com/qurator-spk/sbb_pixelwise_segmentation.git synced 2025-07-14 21:09:59 +02:00

Author	SHA1	Message	Date
johnlockejrr	f298643fcf	Fix `ReduceLROnPlateau` wrong logic	2025-05-17 23:24:40 +03:00

# Training Script Improvements

## Learning Rate Management Fixes

### 1. ReduceLROnPlateau Implementation
- Fixed the learning rate reduction mechanism by replacing the manual epoch loop with a single `model.fit()` call
- This ensures proper tracking of validation metrics across epochs
- Configured with:
```python
reduce_lr = ReduceLROnPlateau(
    monitor='val_loss',
    factor=0.2,      # More aggressive reduction
    patience=3,      # Quick response to plateaus
    min_lr=1e-6,     # Minimum learning rate
    min_delta=1e-5,  # Minimum change to be considered improvement
    verbose=1
)
```

### 2. Warmup Implementation
- Added learning rate warmup using TensorFlow's native scheduling
- Gradually increases learning rate from 1e-6 to target (2e-5) over 5 epochs
- Helps stabilize initial training phase
- Implemented using `PolynomialDecay` schedule:
```python
lr_schedule = tf.keras.optimizers.schedules.PolynomialDecay(
    initial_learning_rate=warmup_start_lr,
    decay_steps=warmup_epochs * steps_per_epoch,
    end_learning_rate=learning_rate,
    power=1.0  # Linear (an increase here, since end > initial)
)
```

### 3. Early Stopping
- Added early stopping to prevent overfitting
- Configured with:
```python
early_stopping = EarlyStopping(
    monitor='val_loss',
    patience=6,
    restore_best_weights=True,
    verbose=1
)
```

## Model Saving Improvements

### 1. Epoch-based Model Saving
- Implemented custom `ModelCheckpointWithConfig` to save both model and config
- Saves after each epoch with corresponding config.json
- Maintains compatibility with original script's saving behavior

### 2. Best Model Saving
- Saves the best model at training end
- If early stopping triggers: saves the best model from training
- If no early stopping: saves the final model

## Configuration
All parameters are configurable through the JSON config file:
```json
{
    "reduce_lr_enabled": true,
    "reduce_lr_monitor": "val_loss",
    "reduce_lr_factor": 0.2,
    "reduce_lr_patience": 3,
    "reduce_lr_min_lr": 1e-6,
    "reduce_lr_min_delta": 1e-5,
    "early_stopping_enabled": true,
    "early_stopping_monitor": "val_loss",
    "early_stopping_patience": 6,
    "early_stopping_restore_best_weights": true,
    "warmup_enabled": true,
    "warmup_epochs": 5,
    "warmup_start_lr": 1e-6
}
```

## Benefits
1. More stable training with proper learning rate management
2. Better handling of training plateaus
3. Automatic saving of best model
4. Maintained compatibility with existing config saving
5. Improved training monitoring and control
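
One plausible reading of why the single `model.fit()` call matters: Keras callbacks such as `ReduceLROnPlateau` reset their internal counters at the start of each `fit()` call, so a loop that calls `fit()` once per epoch never accumulates enough "no improvement" epochs to trigger a reduction. The following is a minimal pure-Python sketch of that bookkeeping, not the repository's code; `PlateauTracker` and `final_lr` are hypothetical names, cooldown is omitted, and the starting rate of 2e-5 is taken from the commit message above.

```python
class PlateauTracker:
    """Simplified stand-in for ReduceLROnPlateau's bookkeeping (no cooldown)."""

    def __init__(self, lr, factor=0.2, patience=3, min_lr=1e-6, min_delta=1e-5):
        self.lr = lr
        self.factor = factor
        self.patience = patience
        self.min_lr = min_lr
        self.min_delta = min_delta
        self.on_train_begin()

    def on_train_begin(self):
        # State is reset whenever training (a fit() call) starts.
        self.best = float("inf")
        self.wait = 0

    def on_epoch_end(self, val_loss):
        if val_loss < self.best - self.min_delta:
            # Improvement: remember it and restart the patience counter.
            self.best = val_loss
            self.wait = 0
        else:
            self.wait += 1
            if self.wait >= self.patience:
                self.lr = max(self.lr * self.factor, self.min_lr)
                self.wait = 0
        return self.lr


def final_lr(val_losses, one_fit_per_epoch):
    """Replay validation losses and return the learning rate at the end."""
    tracker = PlateauTracker(lr=2e-5)
    for loss in val_losses:
        if one_fit_per_epoch:
            tracker.on_train_begin()  # a fresh fit() call for every epoch
        tracker.on_epoch_end(loss)
    return tracker.lr


flat_losses = [0.5] * 6  # validation loss that never improves
lr_single_fit = final_lr(flat_losses, one_fit_per_epoch=False)
lr_manual_loop = final_lr(flat_losses, one_fit_per_epoch=True)
```

With a flat validation loss, the single-fit replay reduces the rate once after three stagnant epochs, while the per-epoch replay leaves it at its starting value because every epoch looks like the first one.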
johnlockejrr	7661080899	LR Warmup and Optimization Implementation	2025-05-17 16:17:38 +03:00

# Learning Rate Warmup and Optimization Implementation

## Overview
Added learning rate warmup functionality to improve training stability, especially when using pretrained weights. The implementation uses TensorFlow's native learning rate scheduling for better performance.

## Changes Made

### 1. Configuration Updates (`runs/train_no_patches_448x448.json`)
Added new configuration parameters for warmup:
```json
{
    "warmup_enabled": true,
    "warmup_epochs": 5,
    "warmup_start_lr": 1e-6
}
```

### 2. Training Script Updates (`train.py`)

#### A. Optimizer and Learning Rate Schedule
- Replaced fixed learning rate with dynamic scheduling
- Implemented warmup using `tf.keras.optimizers.schedules.PolynomialDecay`
- Maintained compatibility with existing ReduceLROnPlateau and EarlyStopping
```python
if warmup_enabled:
    lr_schedule = tf.keras.optimizers.schedules.PolynomialDecay(
        initial_learning_rate=warmup_start_lr,
        decay_steps=warmup_epochs * steps_per_epoch,
        end_learning_rate=learning_rate,
        power=1.0  # Linear (an increase here, since end > initial)
    )
    optimizer = Adam(learning_rate=lr_schedule)
else:
    optimizer = Adam(learning_rate=learning_rate)
```

#### B. Learning Rate Behavior
- Initial learning rate: 1e-6 (configurable via `warmup_start_lr`)
- Target learning rate: 5e-5 (configurable via `learning_rate`)
- Linear increase over 5 epochs (configurable via `warmup_epochs`)
- After warmup, learning rate remains at target value until ReduceLROnPlateau triggers

## Benefits
1. Improved training stability during initial epochs
2. Better handling of pretrained weights
3. Efficient implementation using TensorFlow's native scheduling
4. Configurable through JSON configuration file
5. Maintains compatibility with existing callbacks (ReduceLROnPlateau, EarlyStopping)

## Usage
To enable warmup:
1. Set `warmup_enabled: true` in the configuration file
2. Adjust `warmup_epochs` and `warmup_start_lr` as needed
3. The warmup will automatically integrate with existing learning rate reduction and early stopping

To disable warmup:
- Set `warmup_enabled: false` or remove the warmup parameters from the configuration file
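
The warmup curve described in this commit can be checked by hand without TensorFlow. The following is a minimal sketch, assuming the documented `PolynomialDecay` formula with `cycle=False`; the function name `warmup_lr` and the value `steps_per_epoch=100` are illustrative assumptions, while the start rate, target rate, and epoch count come from the commit message.

```python
def warmup_lr(step, start_lr=1e-6, target_lr=5e-5,
              warmup_epochs=5, steps_per_epoch=100, power=1.0):
    """Pure-Python mirror of PolynomialDecay (cycle=False) used as a warmup ramp."""
    decay_steps = warmup_epochs * steps_per_epoch
    step = min(step, decay_steps)            # the schedule holds its end value afterwards
    fraction_left = 1.0 - step / decay_steps
    # With power=1.0 this is a straight line from start_lr to target_lr.
    return (start_lr - target_lr) * fraction_left ** power + target_lr


# Learning rate at the start of each epoch, plus one epoch past the warmup.
lr_by_epoch = [warmup_lr(epoch * 100) for epoch in range(7)]
```

Because `end_learning_rate` is larger than `initial_learning_rate`, the "decay" schedule rises rather than falls, and it stays at the target value once `decay_steps` is reached.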
johnlockejrr	451188c3b9	Changed deprecated `lr` to `learning_rate` and `model.fit_generator` to `model.fit`	2024-10-19 13:25:50 -07:00
vahidrezanezhad	c502e67c14	adding foreground rgb to augmentation	2024-08-28 02:09:27 +02:00
vahidrezanezhad	f31219b1c9	scaling, channels shuffling, rgb background and red content added to no patch augmentation	2024-08-21 19:33:23 +02:00
vahidrezanezhad	95bbdf8040	updating augmentations	2024-08-21 16:17:59 +02:00
vahidrezanezhad	743f2e97d6	Transformer+CNN structure is added to vision transformer type	2024-06-12 17:39:57 +02:00
vahidrezanezhad	f1fd74c7eb	transformer patch size is dynamic now.	2024-06-12 13:26:27 +02:00
vahidrezanezhad	2aa216e388	binarization as a separate task of segmentation	2024-06-11 17:48:30 +02:00
vahidrezanezhad	41a0e15e79	updating train.py nontransformer backend	2024-06-10 22:15:30 +02:00
vahidrezanezhad	815e5a1d35	updating train.py	2024-06-07 16:24:31 +02:00
vahidrezanezhad	4e4490d740	machine based reading order training is integrated	2024-05-24 16:39:48 +02:00
vahidrezanezhad	a7e1f255f3	Update train.py avoid ensembling if no model weights met the threshold f1 score in the case of classification	2024-05-08 14:47:16 +02:00
vahidrezanezhad	8d1050ec30	inference script is added	2024-05-07 13:34:03 +02:00
vahidrezanezhad	38db3e9289	adding enhancement training	2024-05-06 18:31:48 +02:00
vahidrezanezhad	dbb84507ed	integrating first working classification training model	2024-04-29 20:59:36 +02:00
vahidrezanezhad	d27647a0f1	first working update of branch	2024-04-16 01:00:48 +02:00
cneud	02b1436f39	code formatting with black; typos	2024-04-10 22:20:23 +02:00
cneud	5f84938839	update parameter config docs (fix #11 )	2024-04-10 21:40:23 +02:00
vahidrezanezhad	522f00ab99	adjusting to tf2	2024-04-04 11:26:28 +02:00
vahid	4bea9fd535	continue training, losses and etc	2021-06-22 18:47:59 -04:00
vahid	5fb7552dbe	first updates, padding, rotations	2021-06-22 14:20:51 -04:00
vahidrezanezhad	bb212daf0b	Update main.py	2019-12-10 14:01:55 +01:00
Gerber, Mike	4897fd3dd7	📝 howto: Be more verbose with the subtree pull	2019-12-09 15:33:53 +01:00

24 commits