### INF pipeline

**Requirements**

* Python 3 with mlpy (!), numpy, scikit-learn
* R >= 3.2.3 with cvTools, doParallel, TunePareto, igraph
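
A quick way to confirm that the Python requirements are importable before starting a run is sketched below. The helper name `missing_modules` is hypothetical and not part of the pipeline; the only assumption about the packages is their import names (`mlpy`, `numpy`, and `sklearn` for scikit-learn).

```python
import importlib.util

# Import names of the Python requirements listed above
# (scikit-learn is imported as "sklearn").
REQUIRED_MODULES = ["mlpy", "numpy", "sklearn"]


def missing_modules(names):
    """Return the module names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing Python modules:", ", ".join(missing))
else:
    print("All required Python modules were found.")
```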

To install R via Anaconda: [doc](https://docs.anaconda.com/anaconda/user-guide/tasks/using-r-language/)

To install the R dependencies, run the following command from the R prompt:

`install.packages(c("cvTools", "doParallel", "TunePareto", "igraph"))`

**Input files**

* omics layer 1 data: samples x features, tab-separated, with row & column names
* omics layer 2 data: same as above (**samples must be in the same order as the first file**)
* omics layers 1+2 data: the juxtaposition of the above two files
* labels file: one column, just the labels, no header (**same order as the data files**)
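
The layout described above can be checked, and the juxtaposed file built, with a few lines of Python. This is a minimal sketch using only the standard library, not the code the pipeline itself uses. The function names are hypothetical, and it assumes that the first line of each data file holds the column names (with or without a leading cell above the sample names) and that the first column holds the sample names.

```python
import csv


def read_table(path):
    """Read a tab-separated samples x features table with row and column names."""
    with open(path, newline="") as handle:
        rows = [row for row in csv.reader(handle, delimiter="\t") if row]
    header, body = rows[0], rows[1:]
    # Assumption: the header may or may not carry a corner cell above the
    # sample names; drop it only when the header is as wide as a data row.
    features = header[1:] if len(header) == len(body[0]) else header
    samples = [row[0] for row in body]   # first column: sample names
    values = [row[1:] for row in body]   # remaining columns: feature values
    return samples, features, values


def juxtapose(table1, table2):
    """Place the features of two tables side by side, sample by sample."""
    samples1, features1, values1 = table1
    samples2, features2, values2 = table2
    if samples1 != samples2:
        raise ValueError("samples differ or are not in the same order")
    merged = [row1 + row2 for row1, row2 in zip(values1, values2)]
    return samples1, features1 + features2, merged


def write_table(path, table):
    """Write a table back out as tab-separated text with row and column names."""
    samples, features, values = table
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow(["sample"] + features)
        for name, row in zip(samples, values):
            writer.writerow([name] + row)


def read_labels(path):
    """Read the labels file: one label per line, no header."""
    with open(path) as handle:
        return [line.strip() for line in handle if line.strip()]


def check_labels(table, labels):
    """Raise if the number of labels does not match the number of samples."""
    if len(labels) != len(table[0]):
        raise ValueError("labels file and data file have different lengths")
```

Calling `juxtapose(read_table(layer1_path), read_table(layer2_path))` raises an error when the two layers list their samples in a different order, which is the condition the pipeline relies on. The labels file carries no sample names, so only its length can be checked this way.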

**Example run**

The original pipeline was reimplemented as a Makefile whose variables can be set at run time on the `make` command line.

An example is given in the `runner.sh` script:

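```
# OUTBASE: output base path (OUT must be defined beforehand)
# DATA1:   layer1 dataset
# DATA2:   layer2 dataset
# FILE:    layer1 + layer2 juxtaposed dataset
# LABEL:   sample labels
# Comments are kept above the command: a comment placed after a
# line-continuation backslash would cut the command short.
make -f run_INF_RF-KBest.mk \
         OUTBASE=${OUT} \
         DATA1=data/AG1-G_145_LIT_ALL_tr.txt \
         DATA2=data/CNV-G_145_LIT_ALL_tr.txt \
         FILE=data/AG1-G_CNV-G_145_LIT_ALL_tr.txt \
         LABEL=data/label_145_ALL-EFS_tr.lab
```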
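
When the pipeline has to be launched from Python (for example to loop over several datasets), the same invocation can be assembled as an argument list. The sketch below is only an illustration: `build_make_command` is a hypothetical helper and `results` is a placeholder output location, while the Makefile name, the variable names, and the data paths are those of the example above.

```python
import shlex


def build_make_command(outbase, data1, data2, juxtaposed, labels,
                       makefile="run_INF_RF-KBest.mk"):
    """Build the argument list for one pipeline run."""
    return [
        "make", "-f", makefile,
        f"OUTBASE={outbase}",   # output base path
        f"DATA1={data1}",       # layer1 dataset
        f"DATA2={data2}",       # layer2 dataset
        f"FILE={juxtaposed}",   # layer1 + layer2 juxtaposed dataset
        f"LABEL={labels}",      # sample labels
    ]


command = build_make_command(
    outbase="results",  # placeholder, not a path taken from the pipeline
    data1="data/AG1-G_145_LIT_ALL_tr.txt",
    data2="data/CNV-G_145_LIT_ALL_tr.txt",
    juxtaposed="data/AG1-G_CNV-G_145_LIT_ALL_tr.txt",
    labels="data/label_145_ALL-EFS_tr.lab",
)

# Show the command as it would be typed in a shell.
print(shlex.join(command))
```

Passing `command` to `subprocess.run(command, check=True)` would then start the run and raise an error if `make` exits with a failure.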