catcheR_step2QC - Pooled Cloning Step 2 and hiPSC Genome Editing QC

This step complements Supplemental Protocol 1.

  1. In a new working folder, prepare the following files:

    1. The demultiplexed read 1 file from the NGS run, in .fastq, .fq, or .fastq.gz format.

    2. rc_barcodes_genes.csv, a CSV file with two columns: (1) the shRNA barcodes and (2) the matching shRNA names in the format “GENE.shRNAID”, for example:

    CAAGAGCC,SMAD2.1
    ...
    

    Note

    The file must be comma-separated, with no extra spaces before or after the barcodes or gene names. It is detected automatically, so it must be named exactly rc_barcodes_genes.csv.

    3. (Optional) A .txt file listing the clones of interest, formatted as BC_UCI (see step 1c in SPFourThree).
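
    The formatting rules in the note above can be checked before running the pipeline. The following is a minimal sketch, not part of catcheR: `check_barcode_csv` is a hypothetical helper, and it assumes that the file has no header row, that barcodes contain only the letters A, C, G, T, or N, and that shRNA names contain at least one dot separating the gene from the shRNA ID.

```r
# Hypothetical helper (not part of catcheR): sanity-check rc_barcodes_genes.csv
check_barcode_csv <- function(path) {
  # Read every field as text and keep whitespace so stray spaces can be detected
  tab <- read.csv(path, header = FALSE, colClasses = "character",
                  strip.white = FALSE, na.strings = character(0))
  problems <- character(0)
  if (ncol(tab) != 2) {
    problems <- c(problems, "expected exactly two comma-separated columns")
  } else {
    barcodes <- tab[[1]]
    names_col <- tab[[2]]
    if (any(barcodes != trimws(barcodes)) || any(names_col != trimws(names_col))) {
      problems <- c(problems, "leading or trailing spaces found")
    }
    # Assumption: barcodes are plain uppercase nucleotide strings
    if (any(!grepl("^[ACGTN]+$", trimws(barcodes)))) {
      problems <- c(problems, "barcodes should contain only A, C, G, T or N")
    }
    # Assumption: names look like GENE.shRNAID (text, a dot, more text)
    if (any(!grepl("^.+\\..+$", trimws(names_col)))) {
      problems <- c(problems, "names should look like GENE.shRNAID")
    }
  }
  problems
}

# Usage on a temporary copy of the example row shown above
tmp <- file.path(tempdir(), "rc_barcodes_genes.csv")
writeLines("CAAGAGCC,SMAD2.1", tmp)
result <- check_barcode_csv(tmp)
result  # an empty character vector means no problems were found
```

    The helper returns a vector of messages rather than stopping at the first error, so every formatting issue can be fixed in one pass.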

  2. Run catcheR_step2QC:

    catcheR_step2QC(
        group = c("docker", "sudo"),
        folder,
        fastq.read1,
        DIs = 1000,
        clones = NULL
    )
    

    `catcheR_step2QC` arguments:

    1. group: string, either “sudo” or “docker”, depending on user permissions (see https://docs.docker.com/engine/install/linux-postinstall/)

    2. folder: string with the path to the working directory

    3. fastq.read1: string with the read 1 filename from step 1a

    4. DIs: integer, minimum number of diversity indexes (DIs) required for a given UCI-BC; used to filter for reliably measured UCI-BCs (default: 1000)

    5. clones: (optional) string with the .txt file name from step 1c
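
    To illustrate what the DIs argument controls, here is a toy sketch of a minimum-DI filter. It is not catcheR's implementation: the clone names and counts are made up, and treating the threshold as inclusive (`>=`) is an assumption based on the word "minimum".

```r
# Toy data: made-up DI counts per clone, not real pipeline output
di_counts <- data.frame(
  clone = c("clone_A", "clone_B", "clone_C"),
  n_DIs = c(2500, 40, 1000),
  stringsAsFactors = FALSE
)

# Same value as the default of the DIs argument
DIs <- 1000

# Keep only clones supported by at least `DIs` diversity indexes
# (assumption: the threshold is inclusive)
kept <- di_counts[di_counts$n_DIs >= DIs, ]
kept$clone
```

    In this toy table, clone_B is dropped because it is supported by only 40 DIs, while clone_A and clone_C are kept.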

    Example usage:

    catcheR_step2QC(
        group = "docker",
        folder = "path/to/working/folder",
        fastq.read1 = "filename.fq",
        clones = "filename.txt"
    )
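
    Because all inputs are expected in the working folder, it can help to confirm that they are present before starting the run. The sketch below uses a hypothetical helper, `missing_inputs`, which is not part of catcheR; it only checks that the files named in step 1 exist in the folder.

```r
# Hypothetical helper (not part of catcheR): list required inputs missing from the folder
missing_inputs <- function(folder, fastq.read1, clones = NULL) {
  # The barcode table has a fixed name; the clones file is optional (NULL is dropped by c())
  needed <- c(fastq.read1, "rc_barcodes_genes.csv", clones)
  needed[!file.exists(file.path(folder, needed))]
}

# Demonstration in a temporary folder that contains only the barcode table
demo <- file.path(tempdir(), "step2QC_demo")
dir.create(demo, showWarnings = FALSE)
file.create(file.path(demo, "rc_barcodes_genes.csv"))

missing_demo <- missing_inputs(demo, fastq.read1 = "filename.fq")
missing_demo  # lists "filename.fq", which was never created in the demo folder
```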
    

    `catcheR_step2QC` key outputs:

    1. Pie charts showing DIs per shRNA and per gene target (optionally, also per clone of interest)

    2. Text file listing all clones above the DI threshold

    3. Bar chart of the number of DIs per clone above the DI threshold

_images/step2QC_clone_percentage_filter.jpg