sscocaller: Calling crossovers from single-sperm DNA sequencing reads
sscocaller takes a large BAM file containing aligned DNA reads from a set of single sperm cells and summarizes allele counts at informative SNP markers. A hidden Markov model (HMM) is applied to haplotype each sperm, and the Viterbi algorithm is run to infer the haplotype state sequence along the list of SNP markers.
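The haplotyping step can be illustrated with a minimal two-state Viterbi sketch. This is not sscocaller's actual implementation (which is written in Nim); it only assumes what the options below suggest: each SNP's ALT read count is binomially distributed with a per-state theta (thetaREF, thetaALT), and state switches become more likely with genetic distance (cmPmb). Treating theta as the probability of observing an ALT read, using Haldane's map function for the switch probability, and all function names are assumptions made for illustration.

```python
import math

def log_binom(k, n, theta):
    """Log binomial probability of k ALT reads out of n total reads."""
    return (math.log(math.comb(n, k))
            + k * math.log(theta) + (n - k) * math.log(1.0 - theta))

def switch_prob(pos1, pos2, cm_per_mb=0.1):
    """Probability of a state switch between two SNPs (assumed: Haldane's map function)."""
    morgans = abs(pos2 - pos1) / 1e6 * cm_per_mb / 100.0
    r = 0.5 * (1.0 - math.exp(-2.0 * morgans))
    return max(r, 1e-12)  # floor avoids log(0) for SNPs at the same position

def viterbi(positions, alt_counts, total_counts,
            theta_ref=0.1, theta_alt=0.9, cm_per_mb=0.1):
    """Return the most likely state per SNP: 0 = REF haplotype, 1 = ALT haplotype."""
    if not positions:
        return []
    thetas = (theta_ref, theta_alt)
    # Uniform prior over the two states at the first SNP.
    score = [math.log(0.5) + log_binom(alt_counts[0], total_counts[0], t)
             for t in thetas]
    back = []  # back[i-1][s] = best previous state when SNP i is in state s
    for i in range(1, len(positions)):
        r = switch_prob(positions[i - 1], positions[i], cm_per_mb)
        log_stay, log_switch = math.log(1.0 - r), math.log(r)
        new_score, ptr = [], []
        for s in (0, 1):
            stay = score[s] + log_stay
            switch = score[1 - s] + log_switch
            best, prev = (stay, s) if stay >= switch else (switch, 1 - s)
            emit = log_binom(alt_counts[i], total_counts[i], thetas[s])
            new_score.append(best + emit)
            ptr.append(prev)
        score = new_score
        back.append(ptr)
    # Trace back from the best final state.
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path

# Ten SNPs spaced 1 Mb apart, one read each: REF reads, then ALT reads.
positions = [1_000_000 * i for i in range(1, 11)]
print(viterbi(positions, [0] * 5 + [1] * 5, [1] * 10))
```

A change in the returned state sequence marks a candidate crossover. With the default parameters a single discordant read is not enough to trigger a switch, because the switch penalty outweighs one SNP's emission gain.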
Inputs
- BAM, reads from single sperm cells with BC tag, e.g. from a single-cell alignment pipeline (cellranger)
- VCF, variant call file that contains the list of SNPs
- barcodeFile, the list of cell barcodes
Outputs
- Per-chromosome allele count matrices
- Sequence of inferred Viterbi states (haplotype states) per chromosome
Usage
Obtain allele counts for cell barcodes listed in barcodeFile at the SNP positions in VCF from the BAM file.
Usage:
sscocaller [options] <BAM> <VCF> <barcodeFile> <out_prefix>
Options:
-t --threads <threads> number of BAM decompression threads [default: 4]
-MQ --minMAPQ <mapq> Minimum MAPQ for read filtering [default: 20]
-BQ --baseq <baseq> base quality threshold for a base to be used for counting [default: 13]
-CHR --chrom <chrom> the selected chromosome (whole genome if not supplied, separate by comma if multiple chroms)
-minDP --minDP <minDP> the minimum DP for a SNP to be included in the output file [default: 1]
-maxDP --maxDP <maxDP> the maximum DP for a SNP to be included in the output file [default: 10]
-chrName --chrName <chrName> the chr names with chr prefix or not, if not supplied then no prefix
-thetaREF --thetaREF <thetaREF> the theta for the binomial distribution conditioning on hidden state being REF [default: 0.1]
-thetaALT --thetaALT <thetaALT> the theta for the binomial distribution conditioning on hidden state being ALT [default: 0.9]
-cmPmb --cmPmb <cmPmb> the average centiMorgan distances per megabases default 0.1 cm per Mb [default 0.1]
-h --help show help
Examples
./sscocaller --threads 10 AAAGTAGCACGTCTCT-1.raw.bam AAAGTAGCACGTCTCT-1.raw.bam.dp3.alt.vcf.gz barcodeFile.tsv ./percell/ccsnp-
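When running sscocaller from a script, the same invocation can be assembled programmatically. The sketch below is a hypothetical helper, not part of sscocaller: it only uses the `--threads`, `--minMAPQ` and `--chrom` options and the positional argument order from the usage text above, and the file names are placeholders.

```python
import subprocess

def build_sscocaller_command(bam, vcf, barcode_file, out_prefix,
                             threads=4, min_mapq=20, chroms=None,
                             executable="sscocaller"):
    """Build the argument list: sscocaller [options] <BAM> <VCF> <barcodeFile> <out_prefix>."""
    cmd = [executable, "--threads", str(threads), "--minMAPQ", str(min_mapq)]
    if chroms:
        # Multiple chromosomes are passed as one comma-separated value.
        cmd += ["--chrom", ",".join(chroms)]
    cmd += [bam, vcf, barcode_file, out_prefix]
    return cmd

# Placeholder file names; replace with real inputs.
cmd = build_sscocaller_command("cells.bam", "snps.vcf.gz", "barcodeFile.tsv",
                               "./percell/ccsnp-", threads=10,
                               chroms=["chr1", "chr2"])
print(" ".join(cmd))
# To actually run it (requires sscocaller on PATH):
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems with file paths.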
Setup/installation
Install with nimble
sscocaller uses hts-nim (https://github.com/brentp/hts-nim), which requires the htslib library. If you are building sscocaller from source, you need to install htslib first:
git clone --recursive https://github.com/samtools/htslib.git
cd htslib && git checkout 1.10 && autoheader && autoconf && ./configure --enable-libcurl
cd ..
make -j 4 -C htslib
export LD_LIBRARY_PATH=$HOME/htslib
ls -lh $HOME/htslib/*.so
Then sscocaller can be installed using nimble:
nimble install https://gitlab.svi.edu.au/biocellgen-public/sscocaller.git
The built binary is located at $HOME/.nimble/bin/sscocaller
Using docker
sscocaller is also available as a docker image, svirlyu/sscocaller (https://hub.docker.com/r/svirlyu/sscocaller):
docker run -it svirlyu/sscocaller
## execute sscocaller
/usr/bin/sscocaller -h
Static builds
The static binary, which works on GNU/Linux systems, can be downloaded directly: ./src/sscocaller
The static build was generated by using docker image docker://svirlyu/sscocaller_nsb
/usr/local/bin/nsb -s ./src/sscocaller.nim -n sscocaller.nimble -o /mnt/src -- --d:release --threads:on
The docker image svirlyu/sscocaller_nsb contains the required static libraries, and a static binary build of sscocaller is available at "./src/sscocaller".