catcheR_enrichment - Enrichment / Depletion Analysis

The catcheR_enrichment function evaluates whether specific perturbation groups (genes, shRNAs, clones) are enriched or depleted in the total number of cells, or within specific subpopulations (i.e., Monocle-derived clusters).

Step-by-step

  1. Run catcheR_enrichment:

    catcheR_enrichment(
        group = c("docker", "sudo"),
        folder,
        file,
        meta,
        timepoint = "PSC",
        control_gene = "SCR",
        min_cells_cluster = 70,
        min_cells_shRNA = 40
    )
    

Example usage:

    catcheR_enrichment(
        group = "docker",
        folder = "/3tb/data/ratto/aggr/test/",
        file = "processed_cds.RData",
        meta = "cell_metadata.csv",
        timepoint = "PSC",
        control_gene = "SCR"
    )

Arguments

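  • group: either "docker" or "sudo"

  • folder: path to the directory containing the input files (e.g., /3tb/data/ratto/aggr/test/)

  • file: the file containing the CDS object generated by catcheR_load (e.g., processed_cds.RData)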

  • meta: metadata CSV file (e.g., cell_metadata.csv)

  • timepoint: the baseline time point used as a reference control. Required when comparing enrichment or depletion across time points.

  • control_gene: gene or group used as the statistical reference for enrichment analysis

  • min_cells_cluster: minimum number of cells per cluster (default: 70)

  • min_cells_shRNA: minimum number of cells per perturbation group (default: 40)
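
The two thresholds can be pictured as simple count filters. The following is a minimal sketch in base R, not the package's own code: the toy data frame cell_meta and its columns gene and cluster are hypothetical, and it assumes that clusters and perturbation groups below the thresholds are simply excluded from the analysis.

```r
# Toy metadata: one row per cell. Column names are hypothetical.
cell_meta <- data.frame(
  gene    = c(rep("SCR", 50), rep("GENE_A", 45), rep("GENE_B", 10)),
  cluster = c(rep("1", 80), rep("2", 25))
)

min_cells_cluster <- 70
min_cells_shRNA   <- 40

# Count cells per cluster and per perturbation group.
cluster_sizes <- table(cell_meta$cluster)
group_sizes   <- table(cell_meta$gene)

# Keep only clusters and groups that reach the thresholds.
keep_clusters <- names(cluster_sizes)[cluster_sizes >= min_cells_cluster]
keep_groups   <- names(group_sizes)[group_sizes >= min_cells_shRNA]

filtered <- cell_meta[
  cell_meta$cluster %in% keep_clusters & cell_meta$gene %in% keep_groups,
]
```

In this toy data, cluster 2 (25 cells) and GENE_B (10 cells) fall below the thresholds, so only SCR and GENE_A cells in cluster 1 remain.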

Outputs

catcheR_enrichment produces the following:

  1. Group-level plots:

    • Number of cells in each perturbation group

    Figures: cellsxgene_TETvsnoTET.pdf, cellsxclone_TETvsnoTET.pdf

  2. Volcano plot (cell counts):

    • Enrichment or depletion of cell numbers in perturbation groups

    • Based on log2 fold-change and statistical significance (compared to control)

    • Bar plot of log2 fold-changes for visual comparison

    Figure: volcano_plot_enrich_bigclones_TETvsnoTET.pdf

  3. Bar plots by cluster:

    • Distribution of perturbation groups across Monocle clusters

    Figures: clusters_in_genes_TETvsnoTET.pdf, clusters_in_shRNAs_TETvsnoTET.pdf

  4. Volcano plot (cluster enrichment):

    • Results of Fisher’s exact test on the distribution of each perturbation group across clusters

    • Displays -log10(adjusted p-value) versus log2(fold change) compared to control

    Figures: gene_volcano_fisher_stats_TETvsnoTET.pdf, shRNA_volcano_fisher_stats_TETvsnoTET.pdf

  5. Statistics table:

    • Includes detailed output of Fisher’s test statistics
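
To make the quantities on the cluster-enrichment volcano plot concrete, here is a minimal base R sketch of a per-group Fisher's exact test against the control. It is an illustration under assumptions, not the package's implementation: the counts matrix and the helper test_group are hypothetical, the fold change is taken as the ratio of in-cluster cell fractions, and Benjamini-Hochberg is assumed as the p-value adjustment.

```r
# Toy cell counts per perturbation group (rows) and cluster (columns).
counts <- matrix(
  c(100, 100,   # SCR (control)
     30,  90,   # GENE_A
     60,  55),  # GENE_B
  nrow = 3, byrow = TRUE,
  dimnames = list(c("SCR", "GENE_A", "GENE_B"), c("cluster_1", "cluster_2"))
)
control <- "SCR"
cluster <- "cluster_1"

# Hypothetical helper: compare one group with the control for one cluster.
test_group <- function(g) {
  in_g  <- counts[g, cluster]
  out_g <- sum(counts[g, ]) - in_g
  in_c  <- counts[control, cluster]
  out_c <- sum(counts[control, ]) - in_c
  # 2x2 table: columns are group and control, rows are in/out of the cluster.
  ft <- fisher.test(matrix(c(in_g, out_g, in_c, out_c), nrow = 2))
  # Fold change of the fraction of cells in the cluster, group vs control.
  lfc <- log2((in_g / (in_g + out_g)) / (in_c / (in_c + out_c)))
  data.frame(group = g, log2FC = lfc, p_value = ft$p.value)
}

groups <- setdiff(rownames(counts), control)
stats  <- do.call(rbind, lapply(groups, test_group))

# Adjust for multiple testing (BH is an assumption) and derive the y-axis value.
stats$p_adj           <- p.adjust(stats$p_value, method = "BH")
stats$neg_log10_p_adj <- -log10(stats$p_adj)
```

Each row of stats corresponds to one point on such a volcano plot: log2FC on the x-axis and neg_log10_p_adj on the y-axis.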