This time, I want to implement the workflow shown in this flowchart.
(snake_test) [dengfei@localhost ex4]$ ls *txt
1.txt 2.txt 3.txt
(snake_test) [dengfei@localhost ex4]$ cat *txt
this is 1.txt
this is 2.txt
this is 3.txt
The corresponding Snakefile is as follows:
rule adda:
    input: "{file}.txt"
    output: "{file}_add_a.txt"
    shell: "cat {input} |xargs echo add a >{output}"
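Preview the commands with a dry run: snakemake -np {1,2,3}_add_a.txt
Note: the files to generate, {1,2,3}_add_a.txt, must be written out on the command line, otherwise the command cannot run. The rule's output contains a wildcard, so Snakemake can only fill it in from a concrete target file name.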
(snake_test) [dengfei@localhost ex4]$ snakemake -np {1,2,3}_add_a.txt
Building DAG of jobs...
Job counts:
count jobs
3 adda
3
[Tue Apr 2 21:09:19 2019]
rule adda:
input: 3.txt
output: 3_add_a.txt
jobid: 2
wildcards: file=3
cat 3.txt |xargs echo add a >3_add_a.txt
[Tue Apr 2 21:09:19 2019]
rule adda:
input: 2.txt
output: 2_add_a.txt
jobid: 0
wildcards: file=2
cat 2.txt |xargs echo add a >2_add_a.txt
[Tue Apr 2 21:09:19 2019]
rule adda:
input: 1.txt
output: 1_add_a.txt
jobid: 1
wildcards: file=1
cat 1.txt |xargs echo add a >1_add_a.txt
Job counts:
count jobs
3 adda
3
This was a dry-run (flag -n). The order of jobs does not reflect the order of execution.
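The wildcards lines in the log above show that Snakemake worked out file=1, 2 and 3 from the requested file names. The following Python sketch illustrates the idea under simplifying assumptions: match_output is a hypothetical helper, not part of Snakemake, and it treats every wildcard as the regex .+ (Snakemake's documented default) without any of the real tool's extra checks.

```python
import re

def match_output(pattern, target):
    """Return the wildcard values if target matches pattern, else None."""
    # re.split with a capturing group alternates literal text and
    # wildcard names: "{file}_add_a.txt" -> ['', 'file', '_add_a.txt'].
    parts = re.split(r"\{(\w+)\}", pattern)
    regex = ""
    for i, part in enumerate(parts):
        if i % 2 == 0:
            regex += re.escape(part)      # literal text
        else:
            regex += f"(?P<{part}>.+)"    # wildcard, assumed to match .+
    m = re.fullmatch(regex, target)
    return m.groupdict() if m else None

for target in ["1_add_a.txt", "2_add_a.txt", "3_add_a.txt"]:
    wildcards = match_output("{file}_add_a.txt", target)
    source = "{file}.txt".format(**wildcards)   # fill in the input pattern
    print(target, wildcards, "<-", source)
```

A name that does not fit the output pattern, such as 3.txt, yields None here, which loosely corresponds to Snakemake finding no rule that produces the file.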
Run the command:
snakemake {1,2,3}_add_a.txt
Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 1
Rules claiming more threads will be scaled down.
Job counts:
count jobs
3 adda
3
[Tue Apr 2 21:11:09 2019]
rule adda:
input: 3.txt
output: 3_add_a.txt
jobid: 0
wildcards: file=3
[Tue Apr 2 21:11:09 2019]
Finished job 0.
1 of 3 steps (33%) done
[Tue Apr 2 21:11:09 2019]
rule adda:
input: 1.txt
output: 1_add_a.txt
jobid: 1
wildcards: file=1
[Tue Apr 2 21:11:09 2019]
Finished job 1.
2 of 3 steps (67%) done
[Tue Apr 2 21:11:09 2019]
rule adda:
input: 2.txt
output: 2_add_a.txt
jobid: 2
wildcards: file=2
[Tue Apr 2 21:11:09 2019]
Finished job 2.
3 of 3 steps (100%) done
Complete log: /home/dengfei/test/snakemake/ex4/.snakemake/log/2019-04-02T211109.153566.snakemake.log
Check the *add_a.txt files:
(snake_test) [dengfei@localhost ex4]$ ls *add_a.txt
1_add_a.txt 2_add_a.txt 3_add_a.txt
(snake_test) [dengfei@localhost ex4]$ cat *add_a.txt
add a this is 1.txt
add a this is 2.txt
add a this is 3.txt
Done.
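To see why each output line starts with "add a", it may help to mimic the shell command cat {input} | xargs echo add a in Python. This is only a rough sketch: add_prefix is a hypothetical name, and it ignores the quoting rules that the real xargs applies.

```python
def add_prefix(text, prefix):
    """Approximate `echo <prefix> <words of text>` as run by xargs."""
    # xargs splits its input on whitespace and appends the pieces to the
    # command line; echo then joins all its arguments with single spaces.
    return " ".join(prefix.split() + text.split()) + "\n"

print(add_prefix("this is 1.txt\n", "add a"), end="")
```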
Next, a second rule, addb, is added. The corresponding Snakefile is as follows:
rule adda:
    input: "{file}.txt"
    output: "{file}_add_a.txt"
    shell: "cat {input} |xargs echo add a >{output}"
rule addb:
    input:
        "{file}_add_a.txt"
    output:
        "{file}_add_a_add_b.txt"
    shell:
        "cat {input} | xargs echo add b >{output}"
Preview the commands with: snakemake -np {1,2,3}_add_a_add_b.txt
(snake_test) [dengfei@localhost ex4]$ snakemake {1,2,3}_add_a_add_b.txt
Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 1
Rules claiming more threads will be scaled down.
Job counts:
count jobs
3 addb
3
[Tue Apr 2 21:13:57 2019]
rule addb:
input: 2_add_a.txt
output: 2_add_a_add_b.txt
jobid: 0
wildcards: file=2
[Tue Apr 2 21:13:57 2019]
Finished job 0.
1 of 3 steps (33%) done
[Tue Apr 2 21:13:57 2019]
rule addb:
input: 1_add_a.txt
output: 1_add_a_add_b.txt
jobid: 1
wildcards: file=1
[Tue Apr 2 21:13:57 2019]
Finished job 1.
2 of 3 steps (67%) done
[Tue Apr 2 21:13:57 2019]
rule addb:
input: 3_add_a.txt
output: 3_add_a_add_b.txt
jobid: 2
wildcards: file=3
[Tue Apr 2 21:13:57 2019]
Finished job 2.
3 of 3 steps (100%) done
Complete log: /home/dengfei/test/snakemake/ex4/.snakemake/log/2019-04-02T211357.666661.snakemake.log
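The log above comes from actually running the command (without -n). Only the three addb jobs run, because the *_add_a.txt files already exist from the previous step: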
snakemake {1,2,3}_add_a_add_b.txt
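The way a requested file is traced back through addb and adda can be sketched in Python. This is a simplified illustration, not Snakemake's algorithm: producer and plan are hypothetical helpers, the existing set stands in for the files on disk, and the timestamp checks that Snakemake also performs are left out.

```python
import re

# (rule name, output pattern, input pattern), copied from the Snakefile.
RULES = [
    ("adda", "{file}_add_a.txt", "{file}.txt"),
    ("addb", "{file}_add_a_add_b.txt", "{file}_add_a.txt"),
]

def producer(target):
    """Find the rule whose output pattern matches target, if any."""
    for name, out_pat, in_pat in RULES:
        # Escape the literal parts, then turn {file} into a capture group.
        regex = re.escape(out_pat).replace(r"\{file\}", "(?P<file>.+)")
        m = re.fullmatch(regex, target)
        if m:
            return name, in_pat.format(file=m.group("file"))
    return None

def plan(target, existing):
    """List the jobs needed for target, dependencies first."""
    if target in existing:
        return []                       # already present, nothing to do
    found = producer(target)
    if found is None:
        raise FileNotFoundError(target)
    name, source = found
    return plan(source, existing) + [(name, source, target)]

# Starting from scratch, both rules are needed:
print(plan("1_add_a_add_b.txt", {"1.txt"}))
# With 1_add_a.txt already present, only addb is left:
print(plan("1_add_a_add_b.txt", {"1.txt", "1_add_a.txt"}))
```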
View the flowchart
Command:
snakemake --dag {1,2,3}_add_a_add_b.txt |dot -Tpdf >a.pdf
This generates a.pdf, which contains the flowchart of the workflow.
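snakemake --dag prints the job graph in Graphviz DOT format, which dot then renders to PDF. To give a feel for what travels through that pipe, here is a minimal sketch that writes DOT text for the adda -> addb chains by hand. The node labels and the to_dot helper are assumptions for illustration; Snakemake's actual output uses numbered nodes and extra styling.

```python
def to_dot(edges):
    """Render (upstream, downstream) pairs as Graphviz DOT text."""
    lines = ["digraph workflow {"]
    for upstream, downstream in edges:
        lines.append(f'    "{upstream}" -> "{downstream}";')
    lines.append("}")
    return "\n".join(lines)

# One independent chain per input file: adda feeds addb.
edges = [(f"adda file={n}", f"addb file={n}") for n in ("1", "2", "3")]
print(to_dot(edges))
```

Saved to a file, this text can be rendered with dot -Tpdf in the same way as the Snakemake output.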
The Snakefile, now with a third rule, addc:
rule adda:
    input: "{file}.txt"
    output: "{file}_add_a.txt"
    shell: "cat {input} |xargs echo add a >{output}"
rule addb:
    input:
        "{file}_add_a.txt"
    output:
        "{file}_add_a_add_b.txt"
    shell:
        "cat {input} | xargs echo add b >{output}"
rule addc:
    input:
        "{file}_add_a_add_b.txt"
    output:
        "{file}_add_a_add_b_add_c.txt"
    shell:
        "cat {input} | xargs echo add c >{output}"
Flowchart:
Command:
snakemake --dag {1,2,3}_add_a_add_b_add_c.txt |dot -Tpdf >a1.pdf
Next, add a rule hebing ("merge") that concatenates the results into a single file:
rule adda:
    input: "{file}.txt"
    output: "{file}_add_a.txt"
    shell: "cat {input} |xargs echo add a >{output}"
rule addb:
    input:
        "{file}_add_a.txt"
    output:
        "{file}_add_a_add_b.txt"
    shell:
        "cat {input} | xargs echo add b >{output}"
rule addc:
    input:
        "{file}_add_a_add_b.txt"
    output:
        "{file}_add_a_add_b_add_c.txt"
    shell:
        "cat {input} | xargs echo add c >{output}"
rule hebing:
    input:
        a=expand("{file}_add_a_add_b_add_c.txt",file=["1","2","3"]),
        b=expand("{file}_add_a_add_b.txt",file=["1","2"])
    output:"hebing.txt"
    shell:"cat {input.a} {input.b} >{output}"
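The two expand() calls in rule hebing turn one pattern plus a list of values into a list of concrete file names. A plain-Python stand-in makes the result visible; expand_one is a hypothetical helper that handles a single wildcard only, whereas Snakemake's own expand also combines several wildcards.

```python
def expand_one(pattern, **wildcards):
    """Fill the single wildcard in pattern with each value in turn."""
    # This sketch supports exactly one keyword, e.g. file=[...].
    name, values = next(iter(wildcards.items()))
    return [pattern.format(**{name: value}) for value in values]

a = expand_one("{file}_add_a_add_b_add_c.txt", file=["1", "2", "3"])
b = expand_one("{file}_add_a_add_b.txt", file=["1", "2"])

# {input.a} {input.b} in the shell command corresponds to a followed by b.
print(a + b)
```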
Run the command:
snakemake hebing.txt
Output:
Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 1
Rules claiming more threads will be scaled down.
Job counts:
count jobs
3 addc
1 hebing
4
[Tue Apr 2 21:21:04 2019]
rule addc:
input: 1_add_a_add_b.txt
output: 1_add_a_add_b_add_c.txt
jobid: 1
wildcards: file=1
[Tue Apr 2 21:21:04 2019]
Finished job 1.
1 of 4 steps (25%) done
[Tue Apr 2 21:21:04 2019]
rule addc:
input: 3_add_a_add_b.txt
output: 3_add_a_add_b_add_c.txt
jobid: 3
wildcards: file=3
[Tue Apr 2 21:21:04 2019]
Finished job 3.
2 of 4 steps (50%) done
[Tue Apr 2 21:21:04 2019]
rule addc:
input: 2_add_a_add_b.txt
output: 2_add_a_add_b_add_c.txt
jobid: 2
wildcards: file=2
[Tue Apr 2 21:21:04 2019]
Finished job 2.
3 of 4 steps (75%) done
[Tue Apr 2 21:21:04 2019]
rule hebing:
input: 1_add_a_add_b_add_c.txt, 2_add_a_add_b_add_c.txt, 3_add_a_add_b_add_c.txt, 1_add_a_add_b.txt, 2_add_a_add_b.txt
output: hebing.txt
jobid: 0
[Tue Apr 2 21:21:04 2019]
Finished job 0.
4 of 4 steps (100%) done
Complete log: /home/dengfei/test/snakemake/ex4/.snakemake/log/2019-04-02T212104.719887.snakemake.log
Flowchart:
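Today I tested rule all. It declares the final target files of the workflow; if it is not defined, the targets have to be written on the command line. When no target is given, Snakemake builds the first rule in the Snakefile, which is why rule all goes at the top.
Since the final output file is hebing.txt, we declare it as the target in the Snakefile: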
rule all:
    input:"hebing.txt"
rule adda:
    input: "{file}.txt"
    output: "{file}_add_a.txt"
    shell: "cat {input} |xargs echo add a >{output}"
rule addb:
    input:
        "{file}_add_a.txt"
    output:
        "{file}_add_a_add_b.txt"
    shell:
        "cat {input} | xargs echo add b >{output}"
rule addc:
    input:
        "{file}_add_a_add_b.txt"
    output:
        "{file}_add_a_add_b_add_c.txt"
    shell:
        "cat {input} | xargs echo add c >{output}"
rule hebing:
    input:
        a=expand("{file}_add_a_add_b_add_c.txt",file=["1","2","3"]),
        b=expand("{file}_add_a_add_b.txt",file=["1","2"])
    output:"hebing.txt"
    shell:"cat {input.a} {input.b} >{output}"
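How the target is chosen once rule all is present can be summarized in a few lines of Python. This is a deliberately tiny model with a hypothetical choose_targets function: explicit command-line targets win, otherwise the first rule in the file is used.

```python
def choose_targets(rule_names, cli_targets):
    """Pick what to build: explicit targets win, else the first rule."""
    if cli_targets:
        return list(cli_targets)
    return [rule_names[0]]

# Rule order as written in the Snakefile.
rules = ["all", "adda", "addb", "addc", "hebing"]

print(choose_targets(rules, []))               # plain `snakemake`
print(choose_targets(rules, ["hebing.txt"]))   # `snakemake hebing.txt`
```

Because rule all lists hebing.txt as its input, building all pulls in hebing and, through it, every addc, addb and adda job.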
Run the command:
snakemake
The result is as follows:
(base) [dengfei@localhost ex4]$ snakemake
Provided cores: 1
Rules claiming more threads will be scaled down.
Job counts:
count jobs
3 adda
3 addb
3 addc
1 all
1 hebing
11
rule adda:
input: 1.txt
output: 1_add_a.txt
jobid: 7
wildcards: file=1
Finished job 7.
1 of 11 steps (9%) done
rule adda:
input: 2.txt
output: 2_add_a.txt
jobid: 9
wildcards: file=2
Finished job 9.
2 of 11 steps (18%) done
rule adda:
input: 3.txt
output: 3_add_a.txt
jobid: 10
wildcards: file=3
Finished job 10.
3 of 11 steps (27%) done
rule addb:
input: 3_add_a.txt
output: 3_add_a_add_b.txt
jobid: 8
wildcards: file=3
Finished job 8.
4 of 11 steps (36%) done
rule addb:
input: 1_add_a.txt
output: 1_add_a_add_b.txt
jobid: 3
wildcards: file=1
Finished job 3.
5 of 11 steps (45%) done
rule addb:
input: 2_add_a.txt
output: 2_add_a_add_b.txt
jobid: 6
wildcards: file=2
Finished job 6.
6 of 11 steps (55%) done
rule addc:
input: 3_add_a_add_b.txt
output: 3_add_a_add_b_add_c.txt
jobid: 5
wildcards: file=3
Finished job 5.
7 of 11 steps (64%) done
rule addc:
input: 2_add_a_add_b.txt
output: 2_add_a_add_b_add_c.txt
jobid: 2
wildcards: file=2
Finished job 2.
8 of 11 steps (73%) done
rule addc:
input: 1_add_a_add_b.txt
output: 1_add_a_add_b_add_c.txt
jobid: 4
wildcards: file=1
Finished job 4.
9 of 11 steps (82%) done
rule hebing:
input: 1_add_a_add_b_add_c.txt, 2_add_a_add_b_add_c.txt, 3_add_a_add_b_add_c.txt, 1_add_a_add_b.txt, 2_add_a_add_b.txt
output: hebing.txt
jobid: 1
Finished job 1.
10 of 11 steps (91%) done
localrule all:
input: hebing.txt
jobid: 0
Finished job 0.
11 of 11 steps (100%) done
Check the result:
(base) [dengfei@localhost ex4]$ cat hebing.txt
add c add b add a this is 1.txt
add c add b add a this is 2.txt
add c add b add a this is 3.txt
add b add a this is 1.txt
add b add a this is 2.txt
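The order of the lines in hebing.txt follows from the rule definitions: input.a (the three add_c files) comes first, then input.b (the add_b files for 1 and 2). The small in-memory simulation below reproduces that reasoning without touching any files; the add helper and the dictionaries are illustrative assumptions, not something Snakemake provides.

```python
def add(letter, line):
    """Mimic `cat ... | xargs echo add <letter>` on a one-line file."""
    return " ".join(["add", letter] + line.split())

# Contents of 1.txt, 2.txt and 3.txt, keyed by the wildcard value.
sources = {n: f"this is {n}.txt" for n in ("1", "2", "3")}
add_a = {n: add("a", line) for n, line in sources.items()}
add_b = {n: add("b", line) for n, line in add_a.items()}
add_c = {n: add("c", line) for n, line in add_b.items()}

# hebing concatenates input.a first, then input.b.
hebing = [add_c[n] for n in ("1", "2", "3")] + [add_b[n] for n in ("1", "2")]
print("\n".join(hebing))
```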
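By default, Snakemake looks for a workflow file named Snakefile, but a file with that name gets no syntax highlighting in the editor. The workflow can instead be saved as a.py and run with snakemake -s a.py. The contents of a.py are the same as before: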
rule all:
    input:"hebing.txt"
rule adda:
    input: "{file}.txt"
    output: "{file}_add_a.txt"
    shell: "cat {input} |xargs echo add a >{output}"
rule addb:
    input:
        "{file}_add_a.txt"
    output:
        "{file}_add_a_add_b.txt"
    shell:
        "cat {input} | xargs echo add b >{output}"
rule addc:
    input:
        "{file}_add_a_add_b.txt"
    output:
        "{file}_add_a_add_b_add_c.txt"
    shell:
        "cat {input} | xargs echo add c >{output}"
rule hebing:
    input:
        a=expand("{file}_add_a_add_b_add_c.txt",file=["1","2","3"]),
        b=expand("{file}_add_a_add_b.txt",file=["1","2"])
    output:"hebing.txt"
    shell:"cat {input.a} {input.b} >{output}"
Output:
(base) [dengfei@localhost ex4]$ snakemake -s a.py
Provided cores: 1
Rules claiming more threads will be scaled down.
Job counts:
count jobs
3 adda
3 addb
3 addc
1 all
1 hebing
11
rule adda:
input: 1.txt
output: 1_add_a.txt
jobid: 8
wildcards: file=1
Finished job 8.
1 of 11 steps (9%) done
rule adda:
input: 3.txt
output: 3_add_a.txt
jobid: 10
wildcards: file=3
Finished job 10.
2 of 11 steps (18%) done
rule adda:
input: 2.txt
output: 2_add_a.txt
jobid: 9
wildcards: file=2
Finished job 9.
3 of 11 steps (27%) done
rule addb:
input: 3_add_a.txt
output: 3_add_a_add_b.txt
jobid: 7
wildcards: file=3
Finished job 7.
4 of 11 steps (36%) done
rule addb:
input: 2_add_a.txt
output: 2_add_a_add_b.txt
jobid: 4
wildcards: file=2
Finished job 4.
5 of 11 steps (45%) done
rule addb:
input: 1_add_a.txt
output: 1_add_a_add_b.txt
jobid: 3
wildcards: file=1
Finished job 3.
6 of 11 steps (55%) done
rule addc:
input: 3_add_a_add_b.txt
output: 3_add_a_add_b_add_c.txt
jobid: 2
wildcards: file=3
Finished job 2.
7 of 11 steps (64%) done
rule addc:
input: 2_add_a_add_b.txt
output: 2_add_a_add_b_add_c.txt
jobid: 5
wildcards: file=2
Finished job 5.
8 of 11 steps (73%) done
rule addc:
input: 1_add_a_add_b.txt
output: 1_add_a_add_b_add_c.txt
jobid: 6
wildcards: file=1
Finished job 6.
9 of 11 steps (82%) done
rule hebing:
input: 1_add_a_add_b_add_c.txt, 2_add_a_add_b_add_c.txt, 3_add_a_add_b_add_c.txt, 1_add_a_add_b.txt, 2_add_a_add_b.txt
output: hebing.txt
jobid: 1
Finished job 1.
10 of 11 steps (91%) done
localrule all:
input: hebing.txt
jobid: 0
Finished job 0.
11 of 11 steps (100%) done