Commit d49cbf46 authored by Alessia Marcolini

Improve snakefile section

parent 81de56da
@@ -51,29 +51,34 @@ pip install git+https://gitlab.fbk.eu/MPBA/mlpy.git
 **Example run**
-The INF pipeline is implemented as a Snakefile.
+The INF pipeline is implemented with a [Snakefile](https://snakemake.readthedocs.io/en/stable/index.html).
 The following directory tree is required:
-* {datafolder}/{dataset}/{layer1}_{layer2}_{tr,ts}.txt
-* {datafolder}/{dataset}/labels_{target}_{tr,ts}.txt
-* {datafolder}/{dataset}/{layer1,layer2}_{tr,ts}.txt
-* {outfolder}/{dataset}/{target}/{juxt,rSNF,rSNFi,single}/ _(these will be created if not present)_
+* `{datafolder}/{dataset}/{target}/{split_id}/{layer}_{tr,ts,ts2}.txt`
+* `{datafolder}/{dataset}/{split_id}/labels_{target}_{tr,ts,ts2}.txt`
+* `{outfolder}/{dataset}/{target}/{model}/{split_id}/{juxt,rSNF,rSNFi,single}` _(these will be created if not present)_
-All the {variables} can be specified either in a config.yaml file or on the command line; for example:
-```{python}
-snakemake --config datafolder="data" dataset="breast" target="ER" layer1="gene" layer2="cnv"
+All the {variables} can be specified either in a config.yaml file or on the command line.
+Example:
+```bash
+snakemake --config datafolder=data outfolder=results dataset=tcga_brca target=ER layer1=gene layer2=cnv layer3=prot model=randomForest random=false split_id=0 -p
 ```
-A maximum number of cores can also be set:
-```{python}
+This example showed an example pipeline using three omics layers from BRCA-ER dataset. You can use an arbitrary number of omics layers by adding or removing `layer` arguments accordingly.
+A maximum number of cores can also be set (default is 1):
+```bash
 snakemake [--config etc.] --cores 12
 ```
 The pipeline can be "dry-run" using the `-n` flag:
-```{python}
+```bash
 snakemake --cores 12 -n
 ```
+A bash script (`runner.sh`) is provided for convenience, in order to run the pipeline for each split, to compute Borda of Bordas and to average metrics for all the splits.
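Reviewer note: the updated README text in this commit implies a specific set of input files and a numbered `layerN` convention for the `--config` arguments. The sketch below is one way to check both before launching a run. It is not part of the INF repository: `expected_inputs`, `missing_inputs` and `snakemake_args` are hypothetical helper names, and the code assumes that `{layer}` stands for each individual layer name (the Snakefile may also expect concatenated layer files, which this sketch does not cover).

```python
from itertools import product
from pathlib import Path

# Path templates copied from the updated README text in this commit.
DATA_TEMPLATE = "{datafolder}/{dataset}/{target}/{split_id}/{layer}_{part}.txt"
LABEL_TEMPLATE = "{datafolder}/{dataset}/{split_id}/labels_{target}_{part}.txt"
PARTS = ("tr", "ts", "ts2")


def expected_inputs(datafolder, dataset, target, split_id, layers):
    """Return the input paths implied by the README templates (hypothetical helper)."""
    common = dict(datafolder=datafolder, dataset=dataset, target=target, split_id=split_id)
    # Assumption: one data file per (layer, part) pair.
    paths = [
        Path(DATA_TEMPLATE.format(layer=layer, part=part, **common))
        for layer, part in product(layers, PARTS)
    ]
    # One label file per part.
    paths += [Path(LABEL_TEMPLATE.format(part=part, **common)) for part in PARTS]
    return paths


def missing_inputs(paths):
    """Return the subset of paths that are not existing files."""
    return [p for p in paths if not p.is_file()]


def snakemake_args(config, layers, cores=1, dry_run=False):
    """Build an argv list mirroring the README command (hypothetical helper)."""
    pairs = [f"{key}={value}" for key, value in config.items()]
    # layer1, layer2, ... are numbered from 1, as in the README example.
    pairs += [f"layer{i}={name}" for i, name in enumerate(layers, start=1)]
    args = ["snakemake", "--config", *pairs, "--cores", str(cores)]
    if dry_run:
        args.append("-n")  # dry run, as described in the README
    return args


if __name__ == "__main__":
    layers = ["gene", "cnv", "prot"]
    for path in missing_inputs(expected_inputs("data", "tcga_brca", "ER", 0, layers)):
        print(f"missing: {path}")
    config = {"datafolder": "data", "outfolder": "results", "dataset": "tcga_brca",
              "target": "ER", "model": "randomForest", "random": "false", "split_id": 0}
    print(" ".join(snakemake_args(config, layers, cores=12, dry_run=True)))
```

The argument list returned by `snakemake_args` could be passed to `subprocess.run` once the dry run looks right. Keeping it as a list rather than a shell string avoids quoting issues with layer names.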