MPBA / INF
Commit a6335076, authored Mar 31, 2020 by Marco Chierici
Add resplitter.py
Parent: 1b6eb13f
README.md
...
...
@@ -62,12 +62,25 @@ mv tcga* data
#### Data splits generation
-To recreate the 10 data splits, run the following commands in a shell:
+To recreate the 10 data splits, first run the following commands in a shell:
```bash
Rscript scripts/prepare_ACGT.R --tumor aml --suffix 03 --datadir data/original/Shamir_lab --outdir data/tcga_aml
Rscript scripts/prepare_ACGT.R --tumor kidney --suffix 01 --datadir data/original/Shamir_lab --outdir data/tcga_kirc
Rscript scripts/prepare_BRCA.R --task ER --datadir data/original --outdir data/tcga_brca
Rscript scripts/prepare_BRCA.R --task subtypes --datadir data/original --outdir data/tcga_brca
```
This creates 10 TR/TS partitions, with IDs 0 to 9. To further partition them into the 10 TR/TS/TS2 splits described in the paper, with IDs 50 to 59 (you can use any other IDs), run in a shell:
```bash
for dataset in tcga_aml tcga_kirc; do
    python resplitter.py --datafolder data/$dataset --target OS --n_splits_start 0 --n_splits_end 10 --split_offset 50
done
for target in ER subtypes; do
    python resplitter.py --datafolder data/tcga_breast --target $target --n_splits_start 0 --n_splits_end 10 --split_offset 50
done
```
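
The relationship between the three numeric options and the resulting split IDs can be sketched as follows. This is only an illustration inferred from the text above (splits 0 to 9 with an offset of 50 yield IDs 50 to 59), not the code of `resplitter.py`; the function name `resplit_ids` is hypothetical, and treating `--n_splits_end` as exclusive is an assumption.

```python
def resplit_ids(n_splits_start, n_splits_end, split_offset):
    """Map each source split ID to its assumed output ID.

    Assumption: n_splits_end is exclusive, so start=0 and end=10
    cover the 10 source splits with IDs 0 to 9.
    """
    return {i: i + split_offset for i in range(n_splits_start, n_splits_end)}


# Values taken from the commands above.
mapping = resplit_ids(0, 10, 50)
for source_id, output_id in mapping.items():
    print(f"split {source_id} -> split {output_id}")
```

Under these assumptions, choosing a different offset would shift the output IDs accordingly, which matches the note that any other IDs can be used.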
#### Input files
...
...