Qupid

v0.1.0
Case-control matching and subsequent statistical assessment for microbiome data

Installation

pip install qupid

Qupid is used through the QIIME 2 command line. Run qiime qupid --help to see all available commands.

Matching one-to-many

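Use qiime qupid match-one-to-many to match each case to every control that satisfies the matching criteria. For numeric categories, you must pass tolerances in the form <column_name>+-<tolerance>; for example, age_years+-10 allows a control's age to differ from the case's age by up to 10 years.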

qiime qupid match-one-to-many \
    --m-sample-metadata-file metadata.tsv \
    --p-case-control-column case_control \
    --p-categories sex age_years \
    --p-case-identifier case \
    --p-tolerances age_years+-10 \
    --o-case-match-one-to-many cm_one_to_many.qza
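
To make the matching rule concrete, here is a minimal Python sketch of one-to-many matching. It is not Qupid's implementation: the function names, the dictionary-based metadata, and the sample IDs are all hypothetical, and it assumes that discrete categories must be identical while numeric categories may differ by at most the tolerance (inclusive).

```python
# Illustrative sketch only; not Qupid's code. All names and data are hypothetical.

def parse_tolerances(specs):
    """Turn ["age_years+-10"] into {"age_years": 10.0}."""
    tolerances = {}
    for spec in specs:
        column, _, value = spec.partition("+-")
        tolerances[column] = float(value)
    return tolerances


def match_one_to_many(metadata, case_control_column, case_identifier,
                      categories, tolerances):
    """Map each case ID to the set of control IDs that satisfy every category."""
    cases = {sid: row for sid, row in metadata.items()
             if row[case_control_column] == case_identifier}
    controls = {sid: row for sid, row in metadata.items()
                if row[case_control_column] != case_identifier}

    def compatible(case_row, control_row):
        for column in categories:
            if column in tolerances:
                # Numeric category (assumed rule): difference within tolerance.
                if abs(case_row[column] - control_row[column]) > tolerances[column]:
                    return False
            elif case_row[column] != control_row[column]:
                # Discrete category (assumed rule): values must be identical.
                return False
        return True

    return {
        case_id: {control_id for control_id, control_row in controls.items()
                  if compatible(case_row, control_row)}
        for case_id, case_row in cases.items()
    }


# Hypothetical metadata using the column names from the command above.
metadata = {
    "S1": {"case_control": "case", "sex": "F", "age_years": 30},
    "S2": {"case_control": "case", "sex": "M", "age_years": 55},
    "S3": {"case_control": "control", "sex": "F", "age_years": 38},
    "S4": {"case_control": "control", "sex": "F", "age_years": 45},
    "S5": {"case_control": "control", "sex": "M", "age_years": 50},
    "S6": {"case_control": "control", "sex": "M", "age_years": 60},
}
matches = match_one_to_many(
    metadata, "case_control", "case",
    ["sex", "age_years"], parse_tolerances(["age_years+-10"]),
)
```

In this toy data, S4 is excluded as a match for S1 because the ages differ by 15 years, which exceeds the tolerance of 10.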

Matching one-to-one

From a one-to-many match, you can generate multiple candidate one-to-one matches using qiime qupid match-one-to-one. The number of matches to generate is set with --p-iterations.

qiime qupid match-one-to-one \
    --i-case-match-one-to-many cm_one_to_many.qza \
    --p-iterations 10 \
    --o-case-match-collection cm_collection.qza
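
One way to picture this step is as repeated random bipartite matching: each iteration pairs every case with a distinct control drawn from its one-to-many candidates. The sketch below uses a randomized augmenting-path search, which may differ from what Qupid does internally. The function name, the sample IDs, and the choice to raise an error when a case cannot be paired are assumptions made for illustration.

```python
# Illustrative sketch only; not Qupid's code. All names and data are hypothetical.
import random


def match_one_to_one(one_to_many, rng):
    """Return one {case: control} pairing in which no control is reused."""
    control_to_case = {}

    def try_assign(case_id, visited):
        candidates = sorted(one_to_many[case_id])
        rng.shuffle(candidates)  # random order yields different pairings
        for control_id in candidates:
            if control_id in visited:
                continue
            visited.add(control_id)
            # Take a free control, or move its current case to another control.
            if (control_id not in control_to_case
                    or try_assign(control_to_case[control_id], visited)):
                control_to_case[control_id] = case_id
                return True
        return False

    case_ids = sorted(one_to_many)
    rng.shuffle(case_ids)
    for case_id in case_ids:
        if not try_assign(case_id, set()):
            raise ValueError(f"No unused control available for case {case_id}")
    return {case_id: control_id
            for control_id, case_id in control_to_case.items()}


# Hypothetical one-to-many result: case ID -> eligible control IDs.
one_to_many = {
    "C1": {"K1", "K2"},
    "C2": {"K2", "K3"},
    "C3": {"K1", "K2", "K3"},
}
rng = random.Random(42)
collection = [match_one_to_one(one_to_many, rng) for _ in range(10)]
```

Each entry of collection is one complete one-to-one match; different iterations can pair the same case with different controls.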

Qupid shuffle

The previous two commands can be run in a single step using qiime qupid shuffle, which performs the one-to-many match and then generates the one-to-one matches.

qiime qupid shuffle \
    --m-sample-metadata-file metadata.tsv \
    --p-case-control-column case_control \
    --p-categories sex age_years \
    --p-case-identifier case \
    --p-tolerances age_years+-10 \
    --p-iterations 10 \
    --output-dir shuffle

Statistical assessment of matches

You can assess how different cases are from controls using either univariate data (such as alpha diversity) or multivariate data (distance matrices). The result is a histogram of p-values from either a t-test (univariate) or PERMANOVA (multivariate) comparing cases to controls. Note that for either command, the input data must contain values for all possible cases and controls.

qiime qupid assess-matches-univariate \
    --i-case-match-collection cm_collection.qza \
    --m-data-file data.tsv \
    --m-data-column faith_pd \
    --o-visualization univariate_p_values.qzv

qiime qupid assess-matches-multivariate \
    --i-case-match-collection cm_collection.qza \
    --i-distance-matrix uw_unifrac_distance_matrix.qza \
    --p-permutations 999 \
    --o-visualization multivariate_p_values.qzv
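
The idea behind the assessment can be sketched as follows: compute one p-value per one-to-one match in the collection, then look at the distribution of those p-values. To stay dependency-free, this sketch swaps in a permutation test on the difference in group means rather than the t-test or PERMANOVA that the commands above use. The function names, sample IDs, and values are hypothetical, and this is not Qupid's implementation.

```python
# Illustrative sketch only; not Qupid's code. A permutation test stands in
# for the t-test. All names and data are hypothetical.
import random
from statistics import mean


def permutation_p_value(case_values, control_values, rng, permutations=999):
    """Two-sided permutation p-value for a difference in group means."""
    observed = abs(mean(case_values) - mean(control_values))
    pooled = list(case_values) + list(control_values)
    n_cases = len(case_values)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_cases]) - mean(pooled[n_cases:]))
        if diff >= observed:
            hits += 1
    # Add one to both counts so the p-value is never exactly zero.
    return (hits + 1) / (permutations + 1)


def assess_matches(collection, values, rng, permutations=999):
    """Return one p-value per one-to-one match in the collection."""
    p_values = []
    for matching in collection:
        case_values = [values[case_id] for case_id in matching]
        control_values = [values[control_id] for control_id in matching.values()]
        p_values.append(
            permutation_p_value(case_values, control_values, rng, permutations)
        )
    return p_values


# Hypothetical univariate values (for example, an alpha diversity metric).
values = {"C1": 12.1, "C2": 9.8, "C3": 11.4, "K1": 14.9, "K2": 15.3, "K3": 13.7}
match_collection = [
    {"C1": "K1", "C2": "K2", "C3": "K3"},
    {"C1": "K2", "C2": "K3", "C3": "K1"},
]
p_values = assess_matches(match_collection, values, random.Random(0))
```

Every sample ID that appears in a match must have an entry in values, which mirrors the requirement that the input data cover all possible cases and controls.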