Ruqian Lyu authored

sscocaller: Calling crossovers from single-sperm DNA sequencing reads

sscocaller takes a large BAM file containing aligned DNA reads from a set of single sperm cells and summarizes allele counts at informative SNP markers. A hidden Markov model (HMM) is applied to haplotype each sperm cell, and the Viterbi algorithm is run to derive the inferred sequence of haplotype states across the list of SNP markers.
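As an illustration of the Viterbi step described above, here is a minimal Python sketch of a two-state (REF/ALT) decoder. It is not sscocaller's Nim implementation: the function name `viterbi`, the uniform starting prior, and the toy emission and switch probabilities are all assumptions chosen for the example.

```python
import math

def viterbi(log_emit, switch_prob):
    """Most likely state path for a two-state HMM (0 = REF, 1 = ALT).

    log_emit[i][s] : log P(observation at SNP i | state s)
    switch_prob[i] : P(state changes between SNP i-1 and SNP i); entry 0 is unused
    """
    n = len(log_emit)
    if n == 0:
        return []
    # Assumption: both haplotype states are equally likely at the first SNP.
    score = [math.log(0.5) + log_emit[0][0], math.log(0.5) + log_emit[0][1]]
    back = []  # back[i-1][s] = best previous state when SNP i is in state s
    for i in range(1, n):
        stay = math.log(1.0 - switch_prob[i])
        switch = math.log(switch_prob[i])
        new_score, pointers = [], []
        for s in (0, 1):
            from_same = score[s] + stay
            from_other = score[1 - s] + switch
            if from_same >= from_other:
                best, prev = from_same, s
            else:
                best, prev = from_other, 1 - s
            new_score.append(best + log_emit[i][s])
            pointers.append(prev)
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path

# Toy data: one read per SNP; a read matches the true state with probability 0.9.
observed = [0, 0, 1, 0, 0, 1, 1, 1, 1]  # 0 = REF read, 1 = ALT read
log_emit = [
    (math.log(0.9), math.log(0.1)) if o == 0 else (math.log(0.1), math.log(0.9))
    for o in observed
]
path = viterbi(log_emit, [0.01] * len(observed))
print(path)  # prints [0, 0, 0, 0, 0, 1, 1, 1, 1]
```

In this toy run the single ALT read at the third SNP is treated as noise, while the run of ALT reads at the end is decoded as a switch of haplotype state, which is the kind of signal a crossover would produce.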

Inputs

  • BAM: reads from single sperm cells carrying a BC tag, e.g. the output of a single-cell alignment pipeline (cellranger)
  • VCF: variant call file that contains the list of SNPs to be used
  • barcodeFile: the list of cell barcodes

Outputs

  • Allele count matrices, one set per chromosome
  • Sequence of inferred Viterbi states (haplotype states) per chromosome
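To make the allele-count output more concrete, the sketch below counts REF and ALT reads per SNP and per cell barcode from toy records. It is only an illustration in Python: the record layout, the function names `count_alleles` and `filter_by_depth`, and the choice to apply the depth limits per cell and per SNP are assumptions, and the real tool reads these values from the BAM and VCF files.

```python
from collections import defaultdict

def count_alleles(observations, snps, barcodes, min_mapq=20, min_baseq=13):
    """Count REF/ALT reads per (SNP position, cell barcode).

    observations : iterable of (barcode, pos, base, mapq, baseq) tuples (toy layout)
    snps         : dict mapping SNP position -> (ref_base, alt_base)
    barcodes     : the cell barcodes to keep
    """
    wanted = set(barcodes)
    counts = defaultdict(lambda: [0, 0])  # [ref_count, alt_count]
    for barcode, pos, base, mapq, baseq in observations:
        if barcode not in wanted or pos not in snps:
            continue
        # Assumption: reads/bases strictly below the thresholds are skipped.
        if mapq < min_mapq or baseq < min_baseq:
            continue
        ref, alt = snps[pos]
        if base == ref:
            counts[(pos, barcode)][0] += 1
        elif base == alt:
            counts[(pos, barcode)][1] += 1
    return dict(counts)

def filter_by_depth(counts, min_dp=1, max_dp=10):
    """Keep entries whose total depth lies in [min_dp, max_dp] (assumed per cell)."""
    return {k: v for k, v in counts.items() if min_dp <= sum(v) <= max_dp}

# Toy example with one SNP (A/G at position 100) and one listed barcode.
toy_reads = [
    ("AAAC-1", 100, "A", 60, 30),
    ("AAAC-1", 100, "G", 60, 30),
    ("AAAC-1", 100, "G", 10, 30),  # dropped: low MAPQ
    ("AAAC-1", 100, "G", 60, 5),   # dropped: low base quality
    ("TTTG-1", 100, "A", 60, 30),  # dropped: barcode not listed
]
toy_counts = count_alleles(toy_reads, {100: ("A", "G")}, ["AAAC-1"])
print(filter_by_depth(toy_counts))
```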

Usage

Obtain allele counts for cell barcodes listed in barcodeFile at the SNP positions in VCF from the BAM file.

Usage:
    sscocaller [options] <BAM> <VCF> <barcodeFile> <out_prefix>


Options:
    -t --threads <threads> number of BAM decompression threads [default: 4]
    -MQ --minMAPQ <mapq> Minimum MAPQ for read filtering [default: 20]
    -BQ --baseq <baseq>  base quality threshold for a base to be used for counting [default: 13]
    -CHR --chrom <chrom> the selected chromosome (whole genome if not supplied, separate by comma if multiple chroms)
    -minDP --minDP <minDP> the minimum DP for a SNP to be included in the output file [default: 1]
    -maxDP --maxDP <maxDP> the maximum DP for a SNP to be included in the output file [default: 10]
    -chrName --chrName <chrName> the chr names with chr prefix or not, if not supplied then no prefix
    -thetaREF --thetaREF <thetaREF> the theta for the binomial distribution conditioning on hidden state being REF [default: 0.1]
    -thetaALT --thetaALT <thetaALT> the theta for the binomial distribution conditioning on hidden state being ALT [default: 0.9]
    -cmPmb --cmPmb <cmPmb> the average centiMorgan distance per megabase [default: 0.1]
    -h --help  show help

Examples
    ./sscocaller --threads 10 AAAGTAGCACGTCTCT-1.raw.bam AAAGTAGCACGTCTCT-1.raw.bam.dp3.alt.vcf.gz barcodeFile.tsv ./percell/ccsnp-
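The thetaREF, thetaALT and cmPmb options parameterize the HMM. The Python sketch below shows one plausible way such values could be turned into emission and transition probabilities. It rests on assumptions that the source does not confirm: that theta is the probability of observing an ALT read given the hidden state, that genetic distance is physical distance multiplied by cmPmb, and that 1 cM corresponds to a 1% chance of a state switch. All function names are hypothetical.

```python
import math

def log_binom_pmf(alt, depth, theta):
    """log P(alt ALT reads out of depth reads) under a binomial with parameter theta."""
    return (math.log(math.comb(depth, alt))
            + alt * math.log(theta)
            + (depth - alt) * math.log(1.0 - theta))

def log_emissions(ref_count, alt_count, theta_ref=0.1, theta_alt=0.9):
    """Return (log P(counts | REF state), log P(counts | ALT state))."""
    depth = ref_count + alt_count
    return (log_binom_pmf(alt_count, depth, theta_ref),
            log_binom_pmf(alt_count, depth, theta_alt))

def switch_probability(pos_prev, pos_cur, cm_per_mb=0.1):
    """Assumed switch probability between two SNPs given their base-pair positions."""
    distance_mb = abs(pos_cur - pos_prev) / 1e6
    # Assumption: 1 cM ~ 1% recombination; capped at 0.5 for distant markers.
    return min(0.5, distance_mb * cm_per_mb / 100.0)

# Two ALT reads and no REF reads at one SNP:
e_ref, e_alt = log_emissions(0, 2)
print(math.exp(e_ref), math.exp(e_alt))
# Two SNPs that are 1 Mb apart:
print(switch_probability(1_000_000, 2_000_000))
```

Under these assumptions, a SNP covered only by ALT reads is far more likely under the ALT state, and nearby SNPs have a very small switch probability, so isolated discordant reads do not trigger a crossover call on their own.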

Setup/installation

Install with nimble

sscocaller uses hts-nim (https://github.com/brentp/hts-nim), which requires the htslib library. If you are building sscocaller from source, you need to install htslib first:

git clone --recursive https://github.com/samtools/htslib.git
cd htslib && git checkout 1.10 && autoheader && autoconf && ./configure --enable-libcurl

cd ..
make -j 4 -C htslib
export LD_LIBRARY_PATH=$HOME/htslib
ls -lh $HOME/htslib/*.so

sscocaller can then be installed using nimble:

nimble install https://gitlab.svi.edu.au/biocellgen-public/sscocaller.git

The built binary is located at $HOME/.nimble/bin/sscocaller

Using docker

sscocaller is also available as a Docker image, svirlyu/sscocaller (https://hub.docker.com/r/svirlyu/sscocaller):

docker run -it svirlyu/sscocaller

## execute sscocaller
/usr/bin/sscocaller -h

Static builds

A static binary that works on GNU/Linux systems can be downloaded directly: ./src/sscocaller

The static build was generated using the Docker image docker://svirlyu/sscocaller_nsb:

/usr/local/bin/nsb -s ./src/sscocaller.nim -n sscocaller.nimble -o /mnt/src -- --d:release --threads:on

The Docker image svirlyu/sscocaller_nsb contains the required static libraries, and a static binary build of sscocaller is available at "./src/sscocaller".