The phiperio package provides utilities to import, validate, and manage PhIP-Seq datasets, including standardized conversion pipelines, data checks, and access to cached peptide metadata.
Installation
You can install the development version of phiperio from GitHub with either pak or devtools:
# install.packages("pak")
pak::pak("Polymerase3/phiperio")
# or, using devtools:
# install.packages("devtools")
devtools::install_github("Polymerase3/phiperio")
Usage
For guided walk-throughs, see the pkgdown vignettes:
- Importing long tidy data (convert_standard): cross-sectional and longitudinal tidy inputs.
- Importing multiple files at once: batch ingest of many CSV/Parquet files with sample_id_from_filenames.
- Importing legacy PhIP-Seq data (convert_legacy): classic wide matrices (exist/fold_change/raw counts) plus sample/timepoint metadata.
Aim and key features
phiperio focuses on reliable ingest and validation of PhIP-Seq data, so downstream analyses start from a clean, standardized base. Key features include:
- DuckDB backend + Parquet first: uses DuckDB under the hood and writes/reads Parquet by default as the transaction layer between the phiper data source and phiperio, giving fast I/O and interoperability with other Parquet-aware tools.
- Scales to millions of rows: lazy database pipelines and Parquet storage let you work efficiently with very large PhIP-Seq datasets.
- Import helpers for common PhIP-Seq inputs and peptide metadata (peptide library cached and maintained in the companion phiper repo).
- Strong validation and consistency checks to catch data issues early.
- Lightweight, reproducible pipelines to standardize raw inputs into <phip_data> objects.
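To make the DuckDB + Parquet idea above concrete, here is a minimal sketch that uses only the generic DBI and duckdb packages. It is not phiperio's own API: the table name toy and the columns sample_id, peptide_id, and exist are hypothetical and chosen only for illustration. It writes a small long-format table to Parquet and then aggregates it inside DuckDB, so only the summary is pulled into R.

```r
# Sketch only: plain DBI + duckdb, not phiperio functions.
library(DBI)

# In-memory DuckDB connection
con <- dbConnect(duckdb::duckdb())

# Toy long-format table; column names are hypothetical
toy <- data.frame(
  sample_id  = c("s1", "s1", "s2", "s2"),
  peptide_id = c("p1", "p2", "p1", "p2"),
  exist      = c(1L, 0L, 1L, 1L)
)
dbWriteTable(con, "toy", toy)

# Write the table to a temporary Parquet file
pq <- tempfile(fileext = ".parquet")
dbExecute(con, sprintf("COPY toy TO '%s' (FORMAT PARQUET)", pq))

# Query the Parquet file directly; only the aggregate is returned to R
hits <- dbGetQuery(con, sprintf(
  "SELECT sample_id, SUM(exist) AS n_hits
   FROM read_parquet('%s')
   GROUP BY sample_id
   ORDER BY sample_id", pq
))
print(hits)

dbDisconnect(con, shutdown = TRUE)
```

Because the aggregation runs inside DuckDB, the same pattern should carry over to Parquet files much larger than available memory. How phiperio wires this up internally may differ.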
Issues
Spotted a bug or want to request a feature? Please open an issue: https://github.com/Polymerase3/phiperio/issues