Computes feature-level prevalence across binary group columns and performs pairwise statistical tests:
- Unpaired: Fisher's exact test (2x2) per (rank, feature, group pair).
- Paired: McNemar's exact binomial test per subject.
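As a rough illustration of the two tests named above, the following base R sketch runs Fisher's exact test on a 2x2 table and an exact binomial (McNemar-type) test on discordant pairs. The presence vectors and all variable names are hypothetical, and this is not necessarily how compute_pop() carries out the tests internally.

```r
# Hypothetical presence calls for one feature (1 = present, 0 = absent).
# Assumption: position i in both vectors belongs to the same subject.
group_a <- c(1, 1, 1, 0, 1, 1, 0, 1)
group_b <- c(0, 0, 1, 0, 0, 1, 0, 0)

# Unpaired: build the 2x2 table (group x presence) and apply Fisher's exact test.
tab <- rbind(
  a = c(present = sum(group_a == 1), absent = sum(group_a == 0)),
  b = c(present = sum(group_b == 1), absent = sum(group_b == 0))
)
p_unpaired <- fisher.test(tab)$p.value

# Paired: only discordant pairs are informative for McNemar's test.
n10 <- sum(group_a == 1 & group_b == 0)  # present in a, absent in b
n01 <- sum(group_a == 0 & group_b == 1)  # absent in a, present in b

# Exact version: two-sided binomial test of n10 out of (n10 + n01) against 0.5.
# With no discordant pairs there is nothing to test, so fall back to p = 1.
p_paired <- if (n10 + n01 > 0) {
  binom.test(n10, n10 + n01, p = 0.5)$p.value
} else {
  1
}
```

The paired test ignores concordant subjects entirely, so its power depends on how many subjects change status between the two group levels.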
Presence per sample is determined by a k-of-n rule: a feature is considered
present in a sample if at least pop_k_min of its contributing peptides are
positive. Each group_col must have exactly 2 non-missing levels in the data.
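The k-of-n rule can be sketched in base R as follows. The toy data frame, its species column, and the threshold value are assumptions chosen for illustration; compute_pop() may aggregate differently internally.

```r
# Hypothetical long-format calls: one row per (sample, peptide).
calls <- data.frame(
  sample_id  = c("s1", "s1", "s1", "s2", "s2", "s2"),
  peptide_id = c("p1", "p2", "p3", "p1", "p2", "p3"),
  species    = c("A",  "A",  "B",  "A",  "A",  "B"),
  exist      = c(1,    1,    0,    1,    0,    1)
)
pop_k_min <- 2L  # require at least 2 positive peptides per feature

# Count positive peptides per (sample, feature), here with species as the feature.
n_pos <- aggregate(exist ~ sample_id + species, data = calls, FUN = sum)

# A feature is present in a sample when the count reaches pop_k_min.
n_pos$present <- n_pos$exist >= pop_k_min
```

With pop_k_min = 1L the rule reduces to "any contributing peptide is positive", which matches the default in the usage below.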
Usage
compute_pop(
  x,
  rank_cols,
  group_cols,
  exist_col = "exist",
  pop_k_min = 1L,
  paired = FALSE,
  peptide_library = NULL
)
Arguments
- x
A phip_data object or a data.frame/tibble with at least columns sample_id, peptide_id, exist_col, and all group_cols. If paired is set, the paired linking column must also be present. If rank_cols include non-peptide taxa, x must provide peptide_library.
- rank_cols
Character vector of rank columns, e.g. c("peptide_id", "species").
- group_cols
Character vector of binary grouping columns. Each column must have exactly 2 non-missing levels in the data.
- exist_col
Name of the binary presence column (default "exist").
- pop_k_min
Integer >= 1; k-of-n POP threshold per sample (default 1).
- paired
FALSE (default) or a single string naming the column that links related samples (e.g. "subject_id"). When set, paired McNemar tests are used instead of Fisher.
- peptide_library
Optional peptide metadata table with peptide_id and any non-peptide rank columns. If NULL, taken from x$peptide_library.