The configuration file is a fundamental part of the replicability we want to achieve through dl_setup. It lets you specify everything that may change between model runs, and it also helps keep the code cleaner.

Suppose you want to change the learning rate of an optimizer for a specific model. Normally, the user either changes the value inside the Python code or passes an argument when running the code.

With the first approach, the user must find the file in which `lr` is defined. If they want to change several hyper-parameters defined in different parts of the codebase, this becomes tedious.

With the second approach, the user passes the learning rate and other hyper-parameters through command-line options:

```shell
python myModel.py --lr 0.001 --batch_size 32 --epochs 120
```

This approach also becomes tedious when there are many hyper-parameters.

The config file lets you define all configuration parameters in one place. dl_setup then automatically loads and parses the file, making its content available inside the codebase as a normal Python dictionary.

Inside the code:

```python
# Load the config file (see the runner script section of the wiki).
config = ConfigParser(config_path_passed_as_argument)

# Read a value by section and field name.
config[section_you_want][field_you_want]
```
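
To make the dictionary-style access more concrete, here is a minimal sketch that uses Python's standard `configparser` module rather than dl_setup's own `ConfigParser`, whose internals are not described on this page. The `load_config` helper and the inline config text are hypothetical and only for illustration. It also assumes values arrive as strings, which is how the standard library behaves; dl_setup may handle types differently.

```python
import configparser

# Hypothetical config content; the section and field names mirror the examples on this page.
CONFIG_TEXT = """
[optimizer]
lr = 0.004

[dataloader]
batch_size = 64
"""


def load_config(text: str) -> dict:
    """Parse INI text into a plain nested dictionary of strings (illustrative helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {section: dict(parser[section]) for section in parser.sections()}


config = load_config(CONFIG_TEXT)

# The standard library keeps every value as a string, so convert explicitly.
lr = float(config["optimizer"]["lr"])
batch_size = int(config["dataloader"]["batch_size"])
print(lr, batch_size)
```

With this layout, changing a hyper-parameter means editing a single line of the config file instead of touching the code.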

## Required fields

Some fields are required. They differ between train and inference configs.

### Train

##### environment

* seed → used to ensure replicability of pseudo-random number generation;
* epochs → number of epochs;
* experiment_name → name of the experiment, used by [Aim](https://gitlab.fbk.eu/dsip/templates/dl_setup/-/wikis/Aim) and for [saving model weights](https://gitlab.fbk.eu/dsip/templates/dl_setup/-/wikis/Pytorch);
* run_name → name of the run, used by [Aim](https://gitlab.fbk.eu/dsip/templates/dl_setup/-/wikis/Aim) and for [saving model weights](https://gitlab.fbk.eu/dsip/templates/dl_setup/-/wikis/Pytorch);
* use_early_stopper → boolean flag that enables the early stopper.

##### paths section

* aim_dir → directory in which [Aim](https://gitlab.fbk.eu/dsip/templates/dl_setup/-/wikis/Aim) logs are saved;
* net_weights_dir → directory in which the network weights are saved.

##### dataloader

* batch_size → batch size used.

### Inference

##### environment

* seed → used to ensure replicability of pseudo-random number generation;
* experiment_name → name of the experiment, used to identify the correct weights according to the [pytorch setup](https://gitlab.fbk.eu/dsip/templates/dl_setup/-/wikis/Pytorch);
* run_name → name of the run, used to identify the correct weights according to the [pytorch setup](https://gitlab.fbk.eu/dsip/templates/dl_setup/-/wikis/Pytorch);
* use_early_stopper → boolean flag used to load the early stopper model weights according to the [Pytorch guidelines](https://gitlab.fbk.eu/dsip/templates/dl_setup/-/wikis/Pytorch).

##### paths section

* net_weights_dir → directory in which the network weights are saved.

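
The required fields listed above can be checked before a run starts. The sketch below is one possible way to do this with the standard `configparser` module; it is not part of dl_setup, and the `missing_fields` helper, the two lookup tables, and the sample values are hypothetical.

```python
import configparser

# Required fields per section, copied from the lists above.
REQUIRED_TRAIN = {
    "environment": ["seed", "epochs", "experiment_name", "run_name", "use_early_stopper"],
    "paths": ["aim_dir", "net_weights_dir"],
    "dataloader": ["batch_size"],
}
REQUIRED_INFERENCE = {
    "environment": ["seed", "experiment_name", "run_name", "use_early_stopper"],
    "paths": ["net_weights_dir"],
}


def missing_fields(parser: configparser.ConfigParser, required: dict) -> list:
    """Return the 'section.field' names that are required but absent (illustrative helper)."""
    missing = []
    for section, fields in required.items():
        for field in fields:
            # has_option also returns False when the whole section is missing.
            if not parser.has_option(section, field):
                missing.append(f"{section}.{field}")
    return missing


# Hypothetical inference config used only to exercise the check.
parser = configparser.ConfigParser()
parser.read_string("""
[environment]
seed = 0
experiment_name = my_experiment
run_name = my_run
use_early_stopper = true

[paths]
net_weights_dir = /tmp/weights
""")

inference_missing = missing_fields(parser, REQUIRED_INFERENCE)
train_missing = missing_fields(parser, REQUIRED_TRAIN)
print(inference_missing)
print(train_missing)
```

Here the sample config satisfies the inference requirements, while the train check reports the fields that only a train config needs.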
## paths section

Paths listed in the paths section are automatically mapped to the same directory inside the container through Docker volumes. Example:

```ini
[paths]
myFolder = /home/user/folder
```

The path */home/user/folder/* can also be used inside the container.

## filenames section

Files listed in the filenames section are automatically mapped to the same location inside the container through Docker volumes. Example:

```ini
[filenames]
myFile = /home/user/folder/filename.png
```

The path */home/user/folder/filename.png* can also be used inside the container.

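
As a rough illustration of what "mapped to the same directory" means, the sketch below turns every entry of the paths and filenames sections into a Docker `-v host:container` flag with identical host and container paths. This is an assumption about the mechanism: the page does not say how dl_setup builds the mounts (for example, whether files are mounted directly or through their parent directory), and the `identity_volume_flags` helper is hypothetical.

```python
import configparser


def identity_volume_flags(parser: configparser.ConfigParser) -> list:
    """Build `-v path:path` flags so each entry is visible at the same path in the container."""
    flags = []
    for section in ("paths", "filenames"):
        if not parser.has_section(section):
            continue
        for _name, path in parser.items(section):
            flags += ["-v", f"{path}:{path}"]
    return flags


parser = configparser.ConfigParser()
parser.read_string("""
[paths]
myFolder = /home/user/folder

[filenames]
myFile = /home/user/folder/filename.png
""")

flags = identity_volume_flags(parser)
print(flags)
```

Because host and container paths are identical, code that reads a path from the config works unchanged inside and outside the container.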

## \_\_docker\_\_ section

The \_\_docker\_\_ section is a special section used to define common run options for the Docker container. There are two main fields:

* options → options passed to the docker run command (e.g. `--rm -it`; for more information see the [docker section](https://gitlab.fbk.eu/dsip/templates/dl_setup/-/wikis/Docker));
* volumes → volume mappings from host to container.

This section is used by the [run script](https://gitlab.fbk.eu/dsip/templates/dl_setup/-/wikis/Runner-script) when it executes Docker containers.

## \_\_udocker\_\_ section

The \_\_udocker\_\_ section is a special section used to define common run options for the udocker container (see the [cluster guidelines](https://gitlab.fbk.eu/dsip/templates/dl_setup/-/wikis/Cluster)). There are two main fields:

* options → options passed to the udocker run command (e.g. `--rm -it`; for more information see the [docker section](https://gitlab.fbk.eu/dsip/templates/dl_setup/-/wikis/Docker));
* volumes → volume mappings from host to container.

This section is used by the [run script](https://gitlab.fbk.eu/dsip/templates/dl_setup/-/wikis/Runner-script) when it executes udocker containers.

## Examples

An example of a config file for training (you can add and/or remove any element you want):

```ini
[environment]
seed = 0
num_threads = 23
epochs = 160
run_name = avoiding_overfitting_reducing_lr

[dataloader]
batch_size = 64
num_samples = 0

[optimizer]
lr = 0.004

[scheduler]
patience = 30
factor = 0.7
mode = min

[paths]
scaler_dir = /storage/DSIP/project/min_max_scaler/folder
net_weights_dir = /storage/DSIP/project/model_weights
aim_dir = /storage/DSIP/project/aim_logs

[filenames]
x = /tmp/folder/x_300m.npy
y = /tmp/folder/x_10m.npy
geo = /tmp/folder/geo.npy
metadata = /tmp/folder/input_metadata.pkl
```

And an example of a config file for inference (you can add and/or remove any element you want):

```ini
[environment]
experiment_name = 10m_input
run_name = avoiding_overfitting_reducing_lr

[paths]
scaler_dir = /storage/DSIP/mappiamo_upscale/project/folder
net_weights_dir = /storage/DSIP/project/model_weights

[filenames]
x = /tmp/folder/x_300m.npy
geo = /tmp/folder/geo.npy
metadata = /tmp/folder/input_metadata.pkl
output_file = /storage/DSIP/folder/inference_output/map_output.tif
```

An example of the \_\_docker\_\_ section (it can be placed in the train and/or inference config):

```ini
[__docker__]
options = --rm -it -p 65432:65432
volumes = /storage/DSIP/project/input:/data/input
          /storage/DSIP/project/output:/data/output
          /storage/DSIP/project/net_weights:/data/net_weights
```
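
To show how such a section could translate into a container invocation, the sketch below parses the `options` and `volumes` fields and assembles a `docker run` argument list. It is only an approximation of what the run script might do, not its actual implementation. It assumes the continuation lines of `volumes` are indented, as the standard `configparser` module requires for multi-line values. The `docker_run_command` helper and the image name `my_image:latest` are hypothetical.

```python
import configparser
import shlex

DOCKER_CONFIG = """
[__docker__]
options = --rm -it -p 65432:65432
volumes = /storage/DSIP/project/input:/data/input
          /storage/DSIP/project/output:/data/output
          /storage/DSIP/project/net_weights:/data/net_weights
"""


def docker_run_command(parser: configparser.ConfigParser, image: str) -> list:
    """Assemble a `docker run` argument list from the __docker__ section (illustrative helper)."""
    section = parser["__docker__"]
    command = ["docker", "run"]
    # Split the options string the way a shell would.
    command += shlex.split(section.get("options", fallback=""))
    # Each whitespace-separated entry of `volumes` is one host:container mapping.
    for mapping in section.get("volumes", fallback="").split():
        command += ["-v", mapping]
    command.append(image)
    return command


parser = configparser.ConfigParser()
parser.read_string(DOCKER_CONFIG)

command = docker_run_command(parser, "my_image:latest")  # hypothetical image name
print(" ".join(command))
```

The resulting list could be passed to `subprocess.run`. Building a list instead of a single string avoids quoting problems with paths that contain spaces.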