We assume you have a conda environment with the QIIME 2 Core distribution installed. First, activate the conda environment:
conda activate qiime2-2021.2
Next, install q2-gcn-norm with the following command:
conda install -c jiungwenchen q2-gcn-norm
This plugin normalizes sequences by 16S rRNA gene copy number (GCN) based on the rrnDB database (version 5.7). The script matches the taxa of sequences against the rrnDB-5.7_pantaxa_stats_NCBI.tsv file, starting from the lowest rank. If a match is found, the mean GCN for that taxon is assigned; if not, the script tries the next higher rank until the highest rank is reached. Any sequence that remains unassigned is assumed to have a GCN of one.
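The lookup described above can be sketched in a few lines of Python. This is only an illustration of the lowest-rank-first matching idea, not the plugin's actual code: the function name `assign_gcn`, the `toy_table` dictionary, and the GCN values in it are all hypothetical placeholders, and the lineage format (ranks separated by `;` with prefixes such as `g__`) is assumed.

```python
def assign_gcn(taxonomy, mean_gcn, default=1.0):
    """Return the mean GCN of the lowest-ranked taxon found in mean_gcn.

    taxonomy: lineage string, e.g. "k__Bacteria; p__Firmicutes; g__Bacillus"
    mean_gcn: dict mapping taxon name -> mean GCN (hypothetical stand-in
              for the "mean" column of the pantaxa stats file)
    default:  GCN assumed when no rank matches
    """
    # Split the lineage into ranks, dropping empty pieces
    ranks = [r.strip() for r in taxonomy.split(";") if r.strip()]
    # Walk from the lowest (last) rank up to the highest (first) rank
    for rank in reversed(ranks):
        # Remove a rank prefix such as "g__" if one is present
        name = rank.split("__", 1)[-1].strip()
        if name in mean_gcn:
            return mean_gcn[name]
    # Nothing matched at any rank: assume a single copy
    return default


# Placeholder values for illustration only, not taken from rrnDB
toy_table = {"Bacteria": 2.0, "Firmicutes": 6.0, "Bacillus": 10.0}

print(assign_gcn("k__Bacteria; p__Firmicutes; g__Bacillus", toy_table))
print(assign_gcn("k__Bacteria; p__Firmicutes; g__", toy_table))
print(assign_gcn("Unassigned", toy_table))
```

With the placeholder table, the first lineage matches at the genus level, the second falls back to the phylum because its genus is empty, and the third receives the default of one copy.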
Note that the mean column may differ from the rrnDB online search result. For example, the "mean" GCN for bacteria is 2.02 in rrnDB-5.6_pantaxa_stats_NCBI.tsv, whereas the mean GCN across all bacterial taxa is 5.0 if you search the rrnDB online database (version 5.6). This is because the mean column in the .tsv file is, according to the rrnDB manual, calculated from the means of the pan-taxa of the immediately lower rank.
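A small numeric sketch may help show why a mean of means can differ from a mean taken over all records. The numbers below are invented for illustration and are not rrnDB data; the two child taxa and their per-genome copy numbers are hypothetical.

```python
from statistics import mean

# Hypothetical per-genome copy numbers for two child taxa of one parent
child_a = [2.0]             # one genome
child_b = [6.0, 6.0, 6.0]   # three genomes

# Mean of the child means: each child taxon counts once
mean_of_means = mean([mean(child_a), mean(child_b)])

# Pooled mean: each genome counts once
pooled_mean = mean(child_a + child_b)

print(mean_of_means)  # 4.0
print(pooled_mean)    # 5.0
```

Because the sparsely sampled child taxon carries the same weight as the heavily sampled one in the first calculation, the two results diverge whenever the lower-rank taxa contain unequal numbers of records.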
For the full tutorial, check out the post on the QIIME 2 forum. If you find bugs or have suggestions, please post on the forum or on the GitHub repository.
Stoddard, S. F., Smith, B. J., Hein, R., Roller, B. R., & Schmidt, T. M. (2015). rrnDB: improved tools for interpreting rRNA gene abundance in bacteria and archaea and a new foundation for future development. Nucleic acids research, 43(Database issue), D593–D598. https://doi.org/10.1093/nar/gku1201
Chen, M. Y., Chen, J. W., Wu, L. W., Huang, K. C., Chen, J. Y., Wu, W. S., Chiang, W. F., Shih, C. J., Tsai, K. N., Hsieh, W. T., Ho, Y. H., Wong, T. Y., Wu, J. H., & Chen, Y. L. (2021). Carcinogenesis of Male Oral Submucous Fibrosis Alters Salivary Microbiomes. Journal of Dental Research, 100(4), 397–405. https://doi.org/10.1177/0022034520968750