References
Blanco-Míguez, A., Beghini, F., Cumbo, F. et al. Extending and improving metagenomic taxonomic profiling with uncharacterized species using MetaPhlAn 4. Nat Biotechnol (2023).
Official website
MetaPhlAn 4.0
Installation
Using mamba is recommended.
Install:

    mamba create --name metaphlan4
    mamba activate metaphlan4
    mamba install -c bioconda metaphlan

Database download
By default, the database is downloaded automatically the first time the software runs. It can also be downloaded manually:

    nohup axel -n 60 http://cmprod1.cibio.unitn.it/biobakery4/metaphlan_databases/bowtie2_indexes/mpa_vOct22_CHOCOPhlAnSGB_202212_bt2.tar &
    wget http://cmprod1.cibio.unitn.it/biobakery4/metaphlan_databases/mpa_vOct22_CHOCOPhlAnSGB_202212.tar

After downloading and extracting the archives, place the files in the default directory, which is ~/mambaforge/envs/metaphlan4/lib/python3.10/site-packages/metaphlan/metaphlan_databases.
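The manual placement described above could be scripted roughly as follows. This is only a minimal sketch: the function name `unpack_db` is hypothetical (it is not part of MetaPhlAn), and the target directory in the commented example is the default quoted above, which may differ if your environment lives elsewhere.

```shell
# Hypothetical helper (not part of MetaPhlAn): unpack one downloaded
# archive into the database directory, creating the directory if needed.
unpack_db() {
    archive=$1
    db_dir=$2
    mkdir -p "$db_dir" && tar -xf "$archive" -C "$db_dir"
}

# Example calls, assuming the default directory quoted above:
# DB_DIR=~/mambaforge/envs/metaphlan4/lib/python3.10/site-packages/metaphlan/metaphlan_databases
# unpack_db mpa_vOct22_CHOCOPhlAnSGB_202212.tar "$DB_DIR"
# unpack_db mpa_vOct22_CHOCOPhlAnSGB_202212_bt2.tar "$DB_DIR"
```

Passing the directory as an argument keeps the helper usable when the conda environment or Python version differs from the one shown.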
Once the download has finished, running the code below automatically extracts the archives and builds the database:

Running the analysis
Input files
Four input formats are supported: fasta, fastq, sam, and bowtie2out.
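Because the format has to be passed explicitly via `--input_type`, a small helper can derive it from the file name. The sketch below is an illustration only: `guess_input_type` is a hypothetical function, and the extension-to-format mapping is an assumption based on common naming conventions, not something MetaPhlAn itself defines.

```shell
# Hypothetical helper: map a file name to a value for --input_type.
# A trailing .gz is ignored, so reads.fasta.gz is classified as fasta.
guess_input_type() {
    name=${1%.gz}
    case "$name" in
        *.bowtie2out.txt)     echo bowtie2out ;;
        *.fastq|*.fq)         echo fastq ;;
        *.fasta|*.fa|*.fna)   echo fasta ;;
        *.sam)                echo sam ;;
        *)                    echo unknown; return 1 ;;
    esac
}
```

It could then be used as `metaphlan "$f" --input_type "$(guess_input_type "$f")" ...`; files with unusual extensions still need the type set by hand.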
Example data
Running a single FASTA file

    metaphlan --offline --nproc 60 data/SRS014476-Supragingival_plaque.fasta.gz --input_type fasta > output/SRS014476-Supragingival_plaque_profile.txt

Output files

    head data/SRS014476-Supragingival_plaque.fasta.gz.bowtie2out.txt
    HWUSI-EAS1568_102539179:1:100:10007:7282/1__1.50 UniRef90_P44049|1__7|SGB9649
    HWUSI-EAS1568_102539179:1:100:10008:17064/1__1.53 UniRef90_E0DI50|1__4|SGB17007
    HWUSI-EAS1568_102539179:1:100:10012:9508/1__1.85 UniRef90_A0A3S4XT37|1__5|SGB17007
    HWUSI-EAS1568_102539179:1:100:10013:7741/1__1.92 UniRef90_UNK19880-CIDAPOPB_00170|1__8|SGB19880
    HWUSI-EAS1568_102539179:1:100:10015:15592/1__1.105 SGB6007__DMIBLHOP_01812
    HWUSI-EAS1568_102539179:1:100:10025:5272/1__1.159 UniRef90_A0A3S5F533|1__7|SGB17007
    HWUSI-EAS1568_102539179:1:100:10035:9129/1__1.208 UniRef90_UNK69135-EJPKGDGA_01179|7__12|SGB69135
    HWUSI-EAS1568_102539179:1:100:10036:1783/1__1.212 UniRef90_A0A2A8D7T0|1__5|SGB49305
    HWUSI-EAS1568_102539179:1:100:10039:18013/1__1.234 UniRef90_UNK98242-PKBBBOLB_00132|3__13|SGB98242
    HWUSI-EAS1568_102539179:1:100:10047:18261/1__1.293 UniRef90_I0UTJ6|1__5|SGB16987

The MetaPhlAn taxonomic profile, which is in effect the relative abundance of each taxon in the sample:

    head output/SRS014476-Supragingival_plaque_profile.txt
    k__Bacteria 2 100.0
    k__Bacteria|p__Actinobacteria 2|201174 55.36506
    k__Bacteria|p__Firmicutes 2|1239 44.63494
    k__Bacteria|p__Actinobacteria|c__Actinomycetia 2|201174|1760 55.36506
    k__Bacteria|p__Firmicutes|c__Bacilli 2|1239|91061 44.63494

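Since every rank appears in the same file, it is often handy to pull out a single level. The following is a minimal sketch, assuming the profile is tab-separated with the lineage in column 1 and the relative abundance in column 3 (as the output above suggests) and that header lines start with `#`; the function name `rank_abundance` is hypothetical.

```shell
# Hypothetical helper: print "clade<TAB>abundance" for one taxonomic rank.
# $1 = rank prefix such as p__ (phylum) or s__ (species), $2 = profile file.
rank_abundance() {
    awk -F '\t' -v prefix="$1" '
        /^#/ { next }                        # skip header/comment lines
        {
            n = split($1, lineage, "[|]")    # last element is the clade itself
            if (index(lineage[n], prefix) == 1)
                print lineage[n] "\t" $3
        }' "$2"
}
```

For example, `rank_abundance p__ output/SRS014476-Supragingival_plaque_profile.txt` would list only the phylum-level rows, because deeper lineages end in a `c__`, `o__`, or lower-rank element.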
Re-analyzing a sample
To re-analyze a sample, use the bowtie2 output from the previous step directly. This skips the alignment step, so it runs faster.

    metaphlan --offline --nproc 60 data/SRS014476-Supragingival_plaque.fasta.gz.bowtie2out.txt --input_type bowtie2out > output/SRS014476-Supragingival_plaque_profile.txt

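The choice between the raw reads and the saved alignment can be automated. The sketch below assumes the intermediate file sits next to the reads as `<reads>.bowtie2out.txt`, as in the example above; `pick_input` is a hypothetical helper, and it only prints the arguments rather than running MetaPhlAn.

```shell
# Hypothetical helper: print the input file and --input_type to use for a
# sample, preferring an existing bowtie2out file over the raw FASTA reads.
pick_input() {
    reads=$1
    cached="$reads.bowtie2out.txt"
    if [ -s "$cached" ]; then
        echo "$cached --input_type bowtie2out"
    else
        echo "$reads --input_type fasta"
    fi
}
```

A call such as `metaphlan --offline --nproc 60 $(pick_input data/SRS014476-Supragingival_plaque.fasta.gz) > output/SRS014476-Supragingival_plaque_profile.txt` relies on unquoted word splitting, so it only works for paths without spaces.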
Running multiple samples

    for i in data/SRS*.fasta.gz; do
        s=$(basename "$i" .fasta.gz)
        metaphlan "$i" --input_type fasta --nproc 60 --offline > "output/${s}_profile.txt"
    done
    merge_metaphlan_tables.py output/*_profile.txt > merged_abundance_table.txt

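Before merging, it can be worth checking that every sample actually produced a profile. This is a minimal sketch under the naming convention used in this post (`<sample>.fasta.gz` as input, `<sample>_profile.txt` as output); `missing_profiles` is a hypothetical helper, and the two directories are passed as arguments rather than assumed.

```shell
# Hypothetical helper: list samples whose reads are in directory $1 but
# whose profile in directory $2 is missing or empty.
missing_profiles() {
    for reads in "$1"/*.fasta.gz; do
        [ -e "$reads" ] || continue              # no matching files at all
        sample=$(basename "$reads" .fasta.gz)
        [ -s "$2/${sample}_profile.txt" ] || echo "$sample"
    done
}
```

Running `missing_profiles data output` prints nothing when all profiles are present, which makes it easy to use as a guard before calling `merge_metaphlan_tables.py`.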
Data visualization
Taxon heatmap

    grep -E "s__|SRS" merged_abundance_table.txt \
        | grep -v "t__" \
        | sed "s/^.*|//g" \
        | sed "s/SRS[0-9]*-//g" \
        > merged_abundance_table_species.txt

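To see what each stage of this pipeline does, it can be run on a tiny table. The sketch below wraps the same commands in a hypothetical function, `species_table`, and feeds it made-up data: the sample names, taxa, and abundances are illustrative only, and the header layout (`clade_name` followed by one column per sample) is an assumption about the merged table.

```shell
# Hypothetical wrapper around the pipeline above: read a merged table from
# the file given as $1 and print the species-level table to stdout.
species_table() {
    grep -E "s__|SRS" "$1" \
        | grep -v "t__" \
        | sed "s/^.*|//g" \
        | sed "s/SRS[0-9]*-//g"
}

# Tiny made-up table (values are illustrative, not real results).
demo=$(mktemp)
printf 'clade_name\tSRS000001-Site_A_profile\tSRS000002-Site_B_profile\n' >  "$demo"
printf 'k__Bacteria\t100.0\t100.0\n'                                      >> "$demo"
printf 'k__Bacteria|p__Phylum_x|s__Species_x\t60.0\t40.0\n'               >> "$demo"
printf 'k__Bacteria|p__Phylum_x|s__Species_x|t__SGB1\t60.0\t40.0\n'       >> "$demo"
species_table "$demo"
```

The first `grep` keeps the header (it contains `SRS`) and every row at species level or deeper, the second drops the `t__` (SGB) rows, and the two `sed` calls shorten the lineage to its last element and strip the `SRS…-` prefix from the sample names. On the demo table this leaves the header and the single `s__Species_x` row.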