ntvar

Calls nucleotide variants observed witin an aligned BAM file when compared against a supplied reference file.

Basic Usage

quasitools ntvar [options] <BAM file> <reference file>

Arguments

BAM File

A BAM file (.bam) of sequences aligned to a related reference. A BAM index file (.bai) is also required and should be named the same as the BAM file, with the extension instead changed from ".bam" to ".bai".

Reference File

A reference file related to the aligned BAM sequences. The provided reference file must be the same reference file used when producing the BAM and BAM index files.

Options

Error Rate

-e, --error_rate FLOAT

This is the expected substitution sequencing error rate. The default value is 0.0021 substitutions per sequenced base.

Output

-o, --output FILENAME

The file output location to write the identified nucleotide variants in VCF format. Otherwise, the results are printed to standard output.

Output

The output of the tool is a list of the variants observed witin the aligned BAM file when compared against the supplied reference file. The output is in VCF format and the quasitools provides the following additional information in the FILTER and INFO columns of the VCF file.

Please see Data Formats for more information about the custom fields used by quasitools. However, a short description is provided below.

FILTER

Name Meaning
dp100 This variant was filtered because the coverage depth was less than 100.
q30 This variant was filtered because the quality of the variant was less than 30.
ac5 This variant was filtered because the variant was observed less than 5 times.

INFO

Name Meaning
DP The total coverage depth of the pileup at this position.
AC The number of times this particular variants was observed in the pileup at this position.
AF The frequency of this particular variants was observed in the pileup at this position.

Example

Data

The following example data may be used to run the tool:

Command

quasitools call ntvar variant.bam hiv.fasta

Output

##fileformat=VCFv4.2
##fileDate=20190206
##source=quasitools
##contig=<ID=AF033819.3,length=9181>
##INFO=<ID=DP,Number=1,Type=Integer,Description="Total Depth">
##INFO=<ID=AC,Number=A,Type=Integer,Description="Allele Count">
##INFO=<ID=AF,Number=A,Type=Float,Description="Allele Frequency">
##FILTER=<ID=dp100,Description="Set if True; DP<100">
##FILTER=<ID=q30,Description="Set if True; QUAL<30">
##FILTER=<ID=ac5,Description="Set if True; AC<5">
#CHROM  POS ID  REF ALT QUAL    FILTER  INFO
AF033819.3  153 .   t   a   100 PASS    DP=129;AC=129;AF=1.0000
AF033819.3  342 .   g   c   100 PASS    DP=141;AC=141;AF=1.0000
AF033819.3  719 .   c   g   100 PASS    DP=132;AC=132;AF=1.0000
AF033819.3  1025    .   a   t   100 PASS    DP=129;AC=129;AF=1.0000
AF033819.3  1351    .   c   a   100 PASS    DP=135;AC=135;AF=1.0000
AF033819.3  1657    .   a   c   100 PASS    DP=111;AC=111;AF=1.0000
AF033819.3  1917    .   g   a   100 PASS    DP=128;AC=128;AF=1.0000
AF033819.3  2052    .   t   g   100 PASS    DP=147;AC=147;AF=1.0000
AF033819.3  2368    .   a   t   100 PASS    DP=140;AC=140;AF=1.0000
AF033819.3  2422    .   g   c   100 PASS    DP=146;AC=146;AF=1.0000
AF033819.3  2989    .   a   g   100 PASS    DP=108;AC=108;AF=1.0000
AF033819.3  5707    .   t   a   100 PASS    DP=119;AC=119;AF=1.0000
AF033819.3  5970    .   a   c   100 PASS    DP=117;AC=117;AF=1.0000
AF033819.3  6139    .   c   g   100 PASS    DP=138;AC=138;AF=1.0000
AF033819.3  6674    .   a   t   100 PASS    DP=142;AC=142;AF=1.0000
AF033819.3  7366    .   a   c   100 PASS    DP=129;AC=129;AF=1.0000
AF033819.3  8631    .   c   t   100 PASS    DP=144;AC=144;AF=1.0000