The following steps describe how EGAPx was set up on a data-center supercomputer.
Create the environment

```shell
conda create --name EGAPx_v.0.4.0-alpha
conda activate EGAPx_v.0.4.0-alpha
conda install bioconda::nextflow
conda install conda-forge::singularity
conda install -c conda-forge openjdk=11
```
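Before moving on, it can help to confirm that the three tools installed above are actually on `PATH` inside the activated environment. The following is a minimal sketch; the helper name `check_tools` is hypothetical and not part of EGAPx or conda.

```shell
# Report which of the given commands are missing from PATH.
# Prints one "missing: NAME" line per absent command and
# returns non-zero if at least one command was not found.
check_tools() {
    local missing=0
    local tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Example (run after `conda activate EGAPx_v.0.4.0-alpha`):
# check_tools nextflow singularity java
```

If the function prints nothing and returns 0, all listed commands were found.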
Download the files

```shell
cd ~/software/EGAPX_v.0.4.0-alpha
git clone https://github.com/ncbi/egapx.git
cd egapx
python3 ui/egapx.py -dl -lc ../local_cache
```
Pull the image

```shell
cd ~/software/EGAPX_v.0.4.0-alpha/egapx
rm egap*sif
singularity cache clean
singularity pull docker://docker.1ms.run/ncbi/egapx:0.4.0-alpha
```
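Download the BUSCO database

You need to download the BUSCO lineage database that matches the species identified by `taxid`.

The `yaml` file looks like this: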
```yaml
genome: /share/org/xx/xx/software/EGAPX_v.0.4.0-alpha/example/GCA_020809275.1_ASM2080927v1_genomic.fna.gz
taxid: 6954
short_reads:
  - /share/org/xx/xx/software/EGAPX_v.0.4.0-alpha/example/SRR8506572_1.fastq.gz
  - /share/org/xx/xx/software/EGAPX_v.0.4.0-alpha/example/SRR8506572_2.fastq.gz
  - /share/org/xx/xx/software/EGAPX_v.0.4.0-alpha/example/SRR9005248_1.fastq.gz
  - /share/org/xx/xx/software/EGAPX_v.0.4.0-alpha/example/SRR9005248_2.fastq.gz
locus_tag_prefix: egapxtmp
```
The important field here is `taxid`, because it determines which BUSCO lineage is downloaded.
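Since the download step depends on `taxid` and on the input paths being valid, a quick pre-check of the YAML file can save a failed run. The sketch below assumes a flat file shaped like the one above, where `genome` and every `short_reads` entry is a local path; `get_taxid` and `list_missing_inputs` are hypothetical helper names, and this is simple text matching rather than a real YAML parser.

```shell
# Print the taxid value from a flat EGAPx-style YAML file ($1).
get_taxid() {
    awk -F': *' '$1 == "taxid" { print $2; exit }' "$1"
}

# Print "not found: PATH" for every genome / short_reads path in the
# YAML file ($1) that does not exist on disk.
list_missing_inputs() {
    # Lines of interest look like "genome: PATH" or "  - PATH".
    sed -n -e 's/^genome:[[:space:]]*//p' \
           -e 's/^[[:space:]]*-[[:space:]]*//p' "$1" |
    while IFS= read -r path; do
        [ -e "$path" ] || echo "not found: $path"
    done
}

# Example:
# get_taxid input_D_farinae_small.yaml
# list_missing_inputs input_D_farinae_small.yaml
```

For the file shown above, `get_taxid` would print `6954`; an empty result from `list_missing_inputs` means every listed path exists.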
The download run looks like this:
```
python3 ui/egapx.py input_D_farinae_small.yaml -dl -lc ../local_cache
/mnt/g/database/egapx/ui/egapx.py:1401: SyntaxWarning: invalid escape sequence '\['
  biomol_str = " AND biomol_transcript\[properties\] "

!!WARNING!!
This is an alpha release with limited features and organism scope to collect initial feedback on execution.
Outputs are not yet complete and not intended for production use.

Downloading gnomon/2
Downloading ortholog_references/3
Downloading target_proteins/2
Downloading taxonomy/2
Downloading reference_sets/3
Downloading misc/2
Downloading cmsearch/1
Download SRA to /mnt/g/database/local_cache/sra_dir
Downloading BUSCO lineage arachnida_odb10
```
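To confirm which BUSCO lineage was chosen for the given `taxid`, the output can be saved to a log and the final `Downloading BUSCO lineage ...` line extracted from it. This is a minimal sketch that relies only on the message format shown in the output above; the function name `busco_lineage_from_log` and the log file name `download.log` are hypothetical.

```shell
# Print the BUSCO lineage name reported in a saved download log ($1).
# Relies on lines of the form "Downloading BUSCO lineage NAME";
# prints nothing if no such line is present.
busco_lineage_from_log() {
    sed -n 's/^Downloading BUSCO lineage[[:space:]]*//p' "$1" | tail -n 1
}

# Example:
# python3 ui/egapx.py input_D_farinae_small.yaml -dl -lc ../local_cache 2>&1 | tee download.log
# busco_lineage_from_log download.log
```

For the run shown above this would report `arachnida_odb10`, which matches taxid 6954 (a mite, class Arachnida).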
Configuration file

Create the configuration file that Nextflow will use:
```shell
cd ~/software/EGAPX_v.0.4.0-alpha/egapx
mkdir egapx_config
vi egapx_config/singularity.config
```
Enter the following in this file:
```
singularity.enabled = true
process.container = '/share/org/xxx/xxx/software/EGAPX_v.0.4.0-alpha/egapx/egapx_0.4.0-alpha.sif'
```
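As an alternative to editing the file by hand in `vi`, the same two lines can be written non-interactively, with a check that the image file really exists first. This is a sketch under that assumption; `write_singularity_config` is a hypothetical helper, and it writes only the two settings shown above.

```shell
# Write a Nextflow config that enables Singularity and points at a local image.
# $1: path to the .sif image; $2: config directory (created if absent).
# Returns 1 and writes nothing if the image file does not exist.
write_singularity_config() {
    local sif=$1
    local config_dir=$2
    if [ ! -f "$sif" ]; then
        echo "image not found: $sif" >&2
        return 1
    fi
    mkdir -p "$config_dir"
    cat > "$config_dir/singularity.config" <<EOF
singularity.enabled = true
process.container = '$sif'
EOF
}

# Example (run from the egapx directory, after the image was pulled there):
# write_singularity_config "$PWD/egapx_0.4.0-alpha.sif" egapx_config
```

Passing an absolute image path keeps the config valid regardless of the directory Nextflow is launched from.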
Run script

```shell
source ~/.zshrc
conda activate EGAPx_v.0.4.0-alpha

# Point Java at the OpenJDK installed in the conda environment.
export JAVA_HOME=/share/org/xxx/xxx/miniforge3/envs/EGAPx_v.0.4.0-alpha
export PATH=$JAVA_HOME/bin:$PATH

python ui/egapx.py \
    input_D_farinae_small.yaml \
    -e singularity \
    -w test_workdir \
    -o test_output \
    -lc ../local_cache \
    -resume
```
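Because the script prepends `$JAVA_HOME/bin` to `PATH`, a wrong `JAVA_HOME` would silently fall back to whatever `java` is found elsewhere. A small guard, sketched here with the hypothetical helper name `java_home_ok`, can make the script fail early instead.

```shell
# Succeed only if "$1/bin/java" exists and is executable.
java_home_ok() {
    [ -x "$1/bin/java" ]
}

# Example, placed right after the two export lines in the run script:
# java_home_ok "$JAVA_HOME" || { echo "no java under $JAVA_HOME" >&2; exit 1; }
```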
Submit the job

```shell
sub -J "test_egapx" -o 99.log/test.out -e 99.log/test.err -q c01 -m 500 "bash 00.test_egapx.sh"
```