qcnv

qcnv is designed to act as a pre-processor for copy number analysis tools. It reads one or more BAM files and outputs the read counts within a given window-size to a VCF-like plain text file.

Installation

qcnv requires java 21 and (ideally) a multi-core machine. qcnv is threaded for increased execution speed and is relatively memory efficient at approximately 2GB for a single thread although the memory use scales linearly with thread count so using large numbers of threads requires large amounts of memory. To install:

  • Download the qcnv tar file
  • Untar the tar file into a directory of your choice

You should see jar files for qcnv and its dependencies:

> tar xjvf qcnv-0.3.tar.bz2
x antlr-3.2.jar
x commons-cli-1.2.jar
x htsjdk-1.140.jar
x jopt-simple-4.6.jar
x picard-lib.jar
x qbamfilter-1.2.jar
x qcnv-0.3pre.jar
x qcommon-0.3.jar
x qpicard-1.1.jar
x trove-3.1a1.jar

Usage

java -jar qcnv.jar [options]

Options

--help              Shows this help message.
--version           Print version.

--input, -i         Input BAM file with full path.
--id                Sample id, will be used as column name for output.
--output, -o        Output file.
--log               Log file.

--thread            Opt, Thread number, Def=2.
--window_size, -w   Opt, Window size, Def=10000.
--query, -q         Opt, Query string for selecting reads for cnv counts. 

Output Example

Multiple -i BAM files can be specified and each one can be followed by an --id option to give the file a unique name in the output file. This example shows 3 BAM files from a single cancer patient - tumour, normal, metastasis - being processed through qcnv together.

Java is being told via -Xms40g -Xmx40g that the initial and maximum heap size if 40 gigabytes. The window size is 10,000 bases (-w 10000), 16 threads can be used (-thread 16) and a qbamfilter query string is being used to only process reads that aligned against sequences wth names starting with 'chr' and tht have at least 100 bases with a CIGAR char of 'M ' indicating a match or mismatch ( -q "and(RNAME =~ chr*, cigar_M >= 100)").

 java -jar -Xms40g -Xmx40g  qcnv-0.3pre.jar -w 10000 --thread 16 \
   -i P1tumourBAM --id tumour \
   -i P1normalBAM --id normal \
   -i P1metastasisBAM --id metastasis \
   -o P1qcnv.txt --log p1qcnv.log \
   -q "and( RNAME =~ chr*, cigar_M >= 100 )"

The output shows the number of reads in each window with columns for the 3 BAM files input with labels "tumour", "normal" and "metastasis" as per the --id options.

#CHROM  ID      START   END     FORMAT  tumour  normal  metastasis
chr1    1_10000 1       10000   DP      0       0       0
chr1    2_10000 10001   20000   DP      412     0       2
chr1    3_10000 20001   30000   DP      300     0       1
chr1    4_10000 30001   40000   DP      200     159     109
...