Pipeline Inputs¶
This page documents all input parameters for the pipeline.
Input / output options¶
--workflow¶
Type: string | Optional
Named workflow to execute.
Selects one of the three saved workflows. 'full' requires GPU/SLURM; 'ingest_export' and 'ingest_tabulate' run locally.
Default: full
Allowed values:
- full
- ingest_export
- ingest_tabulate
--input¶
Type: string | Optional | Format: file-path
Path to the samplesheet CSV file. Must contain columns sample_id, output_file_id, and species.
Default: ${projectDir}/data/samplesheet.csv
Pattern: ^\S+\.csv$
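The constraints on --input can be checked before launching a run. The following is a minimal sketch, not the pipeline's own validation code: the function names are hypothetical, and the sample row is invented for illustration. It applies the documented pattern to the path and reports which of the three required columns are missing from the header.

```python
import csv
import io
import re

# Column names and pattern are taken from the --input description above.
REQUIRED_COLUMNS = {"sample_id", "output_file_id", "species"}
PATH_PATTERN = re.compile(r"^\S+\.csv$")


def check_samplesheet_path(path):
    """Return True when the path has no whitespace and ends in .csv."""
    return PATH_PATTERN.match(path) is not None


def missing_columns(csv_text):
    """Return the set of required columns absent from the CSV header row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = set(reader.fieldnames or [])
    return REQUIRED_COLUMNS - header


if __name__ == "__main__":
    # Hypothetical samplesheet content, for illustration only.
    example = "sample_id,output_file_id,species\nS1,101,human\n"
    print(check_samplesheet_path("data/samplesheet.csv"))  # True
    print(check_samplesheet_path("my sheet.csv"))          # False (contains a space)
    print(missing_columns(example))                        # set()
```

Extra columns in the samplesheet are ignored by this check; whether the pipeline itself tolerates them is not stated here.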
--outdir¶
Type: string | Optional | Format: directory-path
Directory where published results are written.
Default: ${projectDir}/outputs
LabKey / Prime-seq options¶
--labkey_base_url¶
Type: string | Required
Base URL of the LabKey server (e.g. https://labkey.example.org).
--labkey_folder¶
Type: string | Required
LabKey folder path (e.g. /My/Project/Folder).
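Because both LabKey parameters are required, it can be convenient to keep them in a parameters file rather than retyping them. Nextflow accepts a JSON file through its -params-file option, with keys matching the parameter names without the leading dashes. The sketch below renders such a file; the helper name is hypothetical, and the values are the placeholder examples from this page, not a real server.

```python
import json

# The two parameters marked "Required" in this section.
REQUIRED = ("labkey_base_url", "labkey_folder")

# Placeholder values copied from the examples above; replace with real ones.
params = {
    "labkey_base_url": "https://labkey.example.org",
    "labkey_folder": "/My/Project/Folder",
    "workflow": "ingest_tabulate",
}


def render_params(values):
    """Return the parameters as JSON text, refusing to omit required keys."""
    missing = [key for key in REQUIRED if not values.get(key)]
    if missing:
        raise ValueError("missing required parameters: " + ", ".join(missing))
    return json.dumps(values, indent=2)


if __name__ == "__main__":
    # Save this output as e.g. params.json and pass it with -params-file.
    print(render_params(params))
```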
Species options¶
--species_order¶
Type: string | Optional
Comma-separated list of species. Controls the order used during cross-species harmonization and scMODAL integration.
Each entry must match the 'species' column values in the samplesheet. Only species present in both the samplesheet and this list are integrated.
Default: human,macaque,mouse
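The rule that only species present in both the samplesheet and --species_order are integrated amounts to an ordered intersection. The sketch below illustrates that reading; it is not the pipeline's implementation, the function name is hypothetical, and exact case-sensitive matching is an assumption.

```python
def species_to_integrate(species_order, samplesheet_species):
    """Return species from species_order that also occur in the samplesheet.

    The result keeps the order given in species_order, not samplesheet order.
    """
    requested = [s.strip() for s in species_order.split(",") if s.strip()]
    present = set(samplesheet_species)
    return [s for s in requested if s in present]


if __name__ == "__main__":
    # Hypothetical 'species' column values from a samplesheet.
    column = ["mouse", "human", "human"]
    print(species_to_integrate("human,macaque,mouse", column))  # ['human', 'mouse']
```

Under this reading, a species listed in the samplesheet but absent from --species_order is left out of integration.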
--export_assay¶
Type: string | Optional
Name of the Seurat assay to export as a 10x-like count matrix in EXPORT_COUNTS.
Default: RNA
Tabulation options¶
--tabulate_id_cols¶
Type: string | Optional
Comma-separated list of subject-level identity columns to carry through to subjectIdTable.csv. cDNA_ID is always included.
Default: cDNA_ID,SubjectId,Vaccine,Timepoint,Tissue
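The statement that cDNA_ID is always included can be pictured as a normalization step on the comma-separated value. The following sketch shows one way this could work; the function name is hypothetical, and placing cDNA_ID first when it was omitted is an assumption rather than documented behavior.

```python
def resolve_id_cols(tabulate_id_cols):
    """Split the comma-separated value, drop blanks and duplicates,
    and make sure cDNA_ID is part of the result."""
    cols = []
    for name in tabulate_id_cols.split(","):
        name = name.strip()
        if name and name not in cols:
            cols.append(name)
    if "cDNA_ID" not in cols:
        # Assumption: the mandatory column is prepended.
        cols.insert(0, "cDNA_ID")
    return cols


if __name__ == "__main__":
    print(resolve_id_cols("SubjectId,Tissue"))  # ['cDNA_ID', 'SubjectId', 'Tissue']
```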
--tabulate_celltype_cols¶
Type: string | Optional
Extra cell-type columns to tabulate in addition to the standard RIRA columns (RIRA_Immune.cellclass, RIRA_TNK_v2.cellclass, RIRA_Myeloid_v3.cellclass), which are always processed when present.
Default: (empty)
--tabulate_parent_col¶
Type: string | Optional
Parent column used to subset child cell-type columns. Defaults to RIRA_Immune.cellclass when empty.
Default: (empty)
--tabulate_celltype_parent_map¶
Type: string | Optional
Comma-separated celltype_col:parentValue pairs that extend or override the built-in hierarchy (e.g. RIRA_TNK_v2.cellclass:TNK,RIRA_Myeloid_v3.cellclass:Myeloid).
Default: (empty)
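The celltype_col:parentValue format for --tabulate_celltype_parent_map can be illustrated with a small parser. This is a sketch under stated assumptions: the function name is hypothetical, the contents of the built-in hierarchy are not documented here (the base mapping below simply reuses the pairs from the example), and "MyCustom.cellclass" and "Bcell" are invented names.

```python
# Assumed stand-in for the built-in hierarchy, copied from the example pairs.
ASSUMED_BUILT_IN = {
    "RIRA_TNK_v2.cellclass": "TNK",
    "RIRA_Myeloid_v3.cellclass": "Myeloid",
}


def parse_parent_map(raw, base=None):
    """Parse 'col:parent,col:parent' and layer it over a base mapping.

    Entries in raw override entries in base that share the same column name.
    """
    result = dict(base or {})
    for pair in raw.split(","):
        pair = pair.strip()
        if not pair:
            continue  # tolerate an empty value or trailing comma
        col, sep, parent = pair.partition(":")
        col, parent = col.strip(), parent.strip()
        if not sep or not col or not parent:
            raise ValueError("expected celltype_col:parentValue, got %r" % pair)
        result[col] = parent
    return result


if __name__ == "__main__":
    # Hypothetical extra column added on top of the assumed built-in pairs.
    merged = parse_parent_map("MyCustom.cellclass:Bcell", ASSUMED_BUILT_IN)
    print(merged["MyCustom.cellclass"])      # Bcell
    print(merged["RIRA_TNK_v2.cellclass"])   # TNK
```

An empty value leaves the base mapping unchanged, which matches the empty default.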
scMODAL integration options¶
--scmodal_container¶
Type: string | Optional
Container image used for GENE_HARMONIZE and SCMODAL_INTEGRATE. Must include scmodal, torch, scanpy, and anndata.
Default: ghcr.io/gwmcelfresh/scmodal-cuda:latest
--scmodal_latent¶
Type: integer | Optional
Dimensionality of the latent embedding produced by scMODAL.
Default: 20
--scmodal_training_steps¶
Type: integer | Optional
Number of training steps for the scMODAL VAE.
Default: 10000
--scmodal_batch_size¶
Type: integer | Optional
Mini-batch size during scMODAL training.
Default: 500
--scmodal_neighbors¶
Type: integer | Optional
Number of nearest neighbours used when building the KNN graph after integration.
Default: 30
--leiden_resolution¶
Type: number | Optional
Leiden clustering resolution applied to the scMODAL latent graph.
Default: 0.5
--scmodal_use_cpu¶
Type: boolean | Optional
CI-only flag. Bypasses the local-executor GPU guard and runs SCMODAL_INTEGRATE as a stub. Intended only for GitHub Actions smoke tests; the output is not scientifically valid.
Default: False
Generic options¶
--help¶
Type: boolean | Optional
Display help text and exit.
Default: False
This pipeline was built with Nextflow. Documentation generated by nf-docs v0.2.1 on 2026-04-16 21:10:06 UTC.