Pipeline Inputs

This page documents all input parameters for the pipeline.

Input / output options

--workflow

Type: string | Optional

Named workflow to execute.

Select one of the three saved workflows. 'full' requires GPU/SLURM. 'ingest_export' and 'ingest_tabulate' run locally.

Default: full

Allowed values: full, ingest_export, ingest_tabulate

--input

Type: string | Optional | Format: file-path

Path to the samplesheet CSV file. Must contain columns sample_id, output_file_id, and species.

Default: ${projectDir}/data/samplesheet.csv

Pattern: ^\S+\.csv$
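
The two constraints on --input (the path pattern and the three required columns) can be checked before launching a run. The following is a minimal sketch in Python, not the pipeline's own validation code; the function name check_samplesheet and the example rows are hypothetical.

```python
import csv
import io
import re

# Values taken from the --input documentation above.
REQUIRED_COLUMNS = {"sample_id", "output_file_id", "species"}
PATH_PATTERN = re.compile(r"^\S+\.csv$")


def check_samplesheet(path, handle):
    """Return a list of problems found; an empty list means both checks passed."""
    problems = []
    # The documented pattern rejects whitespace in the path and requires a .csv suffix.
    if not PATH_PATTERN.match(path):
        problems.append(f"path {path!r} does not match {PATH_PATTERN.pattern}")
    # Only the header row is inspected here; row contents are not validated.
    header = next(csv.reader(handle), [])
    missing = REQUIRED_COLUMNS - {name.strip() for name in header}
    if missing:
        problems.append("missing columns: " + ", ".join(sorted(missing)))
    return problems


# Hypothetical samplesheet content, for illustration only.
example = io.StringIO("sample_id,output_file_id,species\nS1,1001,human\n")
print(check_samplesheet("data/samplesheet.csv", example))
```

For a real file, pass an open file handle instead of the in-memory example. Extra columns are allowed by this sketch; whether the pipeline itself tolerates them is not stated here.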

--outdir

Type: string | Optional | Format: directory-path

Directory where published results are written.

Default: ${projectDir}/outputs

LabKey / Prime-seq options

--labkey_base_url

Type: string | Required

Base URL of the LabKey server (e.g. https://labkey.example.org).

--labkey_folder

Type: string | Required

LabKey folder path (e.g. /My/Project/Folder).

Species options

--species_order

Type: string | Optional

Comma-separated list of species. Controls the order used during cross-species harmonization and scMODAL integration.

Each entry must match the 'species' column values in the samplesheet. Only species present in both the samplesheet and this list are integrated.

Default: human,macaque,mouse
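
One way to read the rule above is as an ordered intersection: the list sets the order, and the samplesheet decides which entries survive. The sketch below illustrates that reading in Python; the function name species_to_integrate is hypothetical, and details such as whitespace trimming and case-sensitive matching are assumptions rather than documented behaviour.

```python
def species_to_integrate(species_order, samplesheet_species):
    """Keep the order given by --species_order, dropping species absent from the samplesheet."""
    # Assumption: entries are trimmed and empty entries are ignored.
    requested = [s.strip() for s in species_order.split(",") if s.strip()]
    present = set(samplesheet_species)
    # Species found only in the samplesheet (not in the list) are also left out.
    return [s for s in requested if s in present]


# Hypothetical 'species' column values, for illustration only.
print(species_to_integrate("human,macaque,mouse", ["mouse", "human", "human", "rat"]))
# -> ['human', 'mouse']
```

In this example macaque is dropped because no sample has that species, and rat is dropped because it is not in the list.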

--export_assay

Type: string | Optional

Seurat assay name to export as a 10x-like count matrix in EXPORT_COUNTS.

Default: RNA

Tabulation options

--tabulate_id_cols

Type: string | Optional

Comma-separated list of subject-level identity columns to carry through to subjectIdTable.csv. cDNA_ID is always included.

Default: cDNA_ID,SubjectId,Vaccine,Timepoint,Tissue
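
The "always included" rule means cDNA_ID does not have to be repeated when overriding the default. A minimal Python sketch of how such a list could be resolved follows; the function name resolve_id_cols is hypothetical, and placing an implicitly added cDNA_ID first is an assumption, since the actual column order in subjectIdTable.csv is not documented here.

```python
def resolve_id_cols(tabulate_id_cols):
    """Split the comma-separated value and make sure cDNA_ID is part of the result."""
    cols = []
    for name in tabulate_id_cols.split(","):
        name = name.strip()
        # Skip empty entries and duplicates while keeping first-seen order.
        if name and name not in cols:
            cols.append(name)
    # Assumption: when cDNA_ID is omitted, it is added at the front.
    if "cDNA_ID" not in cols:
        cols.insert(0, "cDNA_ID")
    return cols


print(resolve_id_cols("SubjectId,Tissue"))
# -> ['cDNA_ID', 'SubjectId', 'Tissue']
```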

--tabulate_celltype_cols

Type: string | Optional

Extra cell-type columns to tabulate in addition to the standard RIRA columns (RIRA_Immune.cellclass, RIRA_TNK_v2.cellclass, RIRA_Myeloid_v3.cellclass), which are always processed when present.

Default: (empty string)

--tabulate_parent_col

Type: string | Optional

Parent column used to subset child cell-type columns. Defaults to RIRA_Immune.cellclass when empty.

Default: (empty string)

--tabulate_celltype_parent_map

Type: string | Optional

Comma-separated celltype_col:parentValue pairs that extend or override the built-in hierarchy (e.g. RIRA_TNK_v2.cellclass:TNK,RIRA_Myeloid_v3.cellclass:Myeloid).

Default: (empty string)
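
The celltype_col:parentValue format can be parsed into a simple lookup table in which user-supplied pairs take precedence over built-in ones. The Python sketch below shows one way to do that; the function name parse_parent_map is hypothetical, and the contents of the built-in hierarchy are not listed on this page, so the builtin argument is a placeholder. Splitting on the last colon is an assumption that parent values contain no colons.

```python
def parse_parent_map(spec, builtin=None):
    """Turn 'col:parent,col:parent' into a dict layered over a built-in mapping."""
    # Copy the built-in mapping so the caller's dict is not modified.
    mapping = dict(builtin or {})
    for pair in spec.split(","):
        pair = pair.strip()
        if not pair:
            continue  # an empty value (the default) adds nothing
        # Assumption: split on the last colon; column names may contain dots.
        col, sep, parent = pair.rpartition(":")
        if not sep or not col or not parent:
            raise ValueError(f"expected celltype_col:parentValue, got {pair!r}")
        mapping[col] = parent  # later or user-supplied pairs override earlier ones
    return mapping


# The value used as the example in the parameter description above.
print(parse_parent_map("RIRA_TNK_v2.cellclass:TNK,RIRA_Myeloid_v3.cellclass:Myeloid"))
# -> {'RIRA_TNK_v2.cellclass': 'TNK', 'RIRA_Myeloid_v3.cellclass': 'Myeloid'}
```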

scMODAL integration options

--scmodal_container

Type: string | Optional

Container image used for GENE_HARMONIZE and SCMODAL_INTEGRATE. Must include scmodal, torch, scanpy, and anndata.

Default: ghcr.io/gwmcelfresh/scmodal-cuda:latest

--scmodal_latent

Type: integer | Optional

Dimensionality of the latent embedding produced by scMODAL.

Default: 20

--scmodal_training_steps

Type: integer | Optional

Number of training steps for the scMODAL VAE.

Default: 10000

--scmodal_batch_size

Type: integer | Optional

Mini-batch size during scMODAL training.

Default: 500

--scmodal_neighbors

Type: integer | Optional

Number of nearest neighbours used when building the KNN graph after integration.

Default: 30

--leiden_resolution

Type: number | Optional

Leiden clustering resolution applied to the scMODAL latent graph.

Default: 0.5

--scmodal_use_cpu

Type: boolean | Optional

CI-only flag. Bypasses the local-executor GPU guard and runs SCMODAL_INTEGRATE as a stub. Intended for GitHub Actions smoke tests only. Produces no scientifically valid output.

Default: false

Generic options

--help

Type: boolean | Optional

Display help text and exit.

Default: false


This pipeline was built with Nextflow. Documentation generated by nf-docs v0.2.1 on 2026-04-16 21:10:06 UTC.