Commit 08f121a3 authored by Ruqian Lyu

update README

parent abe49214
Pipeline #6604 failed
......@@ -10,9 +10,39 @@ sequence for the list of SNP markers.
![sscocaller_fig](images/sscocaller_fig.png)
## Hidden Markov Model configuration
- Observations. The allele-specific read counts across the informative SNP markers for
each chromosome in each sperm cell.
- States. Sperm cells have haploid genomes, so each SNP site $i$ has one of two
possible hidden states (haplotypes): $s_i = 0$ corresponds to an ALT segment and
$s_i = 1$ corresponds to a REF segment.
- Emission probabilities. Two binomial distributions were used for modelling the
emission probabilities for sperm cells at each SNP marker. For each site $i$, with
$c_r$ and $c_a$ denoting the REF and ALT allele read counts:
$$ c = c_r + c_a ~,$$
$$c_a |_{s_i = 0} \sim Bin(c,\theta_{ALT} ) ~,$$
$$c_a |_{s_i = 1}\sim Bin(c,\theta_{REF} ) ~.$$
- Transition probabilities. A distance-dependent transition probability [[1]](#1)
was applied, which corresponded to an average of `--cmPmb` cM (centiMorgan) per 1 Mb
(1 million base pairs):
$$p_{ij} = 1-e^{-d_{ij}\times 0.5\times 10^{-8}} ~,$$
where $p_{ij}$ is the probability of transitioning to a different
state at SNP $j$ from SNP $i$, and $d_{ij}$ denotes the physical distance in
base pairs between SNP $i$ and SNP $j$.
- Initial probabilities. The initial probabilities for the two hidden states
were both set to 0.5, since the two states were assumed to be equally likely.
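
The configuration above can be sketched in a few lines of Python. This is only an illustration of the model as described, not the `sscocaller` implementation. The function names, the example values `theta_alt=0.9` and `theta_ref=0.1`, the generalisation of the rate constant to `cm_per_mb * 1e-8`, and the use of Viterbi decoding to recover the most likely state sequence are all assumptions made for this sketch.

```python
import math


def log_binom_pmf(k, n, theta):
    """Log-probability of k successes in n trials under Bin(n, theta)."""
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + k * math.log(theta) + (n - k) * math.log(1.0 - theta)


def switch_prob(d_bp, cm_per_mb=0.5):
    """p_ij = 1 - exp(-d_ij * cm_per_mb * 1e-8); cm_per_mb=0.5 matches the formula above."""
    return -math.expm1(-d_bp * cm_per_mb * 1e-8)


def viterbi(positions, ref_counts, alt_counts,
            theta_alt=0.9, theta_ref=0.1, cm_per_mb=0.5):
    """Most likely state path (0 = ALT, 1 = REF) for one chromosome of one cell.

    Assumes positions are strictly increasing and the input is non-empty.
    The theta values are illustrative, not taken from the README.
    """
    thetas = (theta_alt, theta_ref)  # indexed by state: 0 = ALT, 1 = REF

    def emit(i, state):
        c = ref_counts[i] + alt_counts[i]  # c = c_r + c_a
        return log_binom_pmf(alt_counts[i], c, thetas[state])

    # Initial probabilities are 0.5 for both states.
    score = [math.log(0.5) + emit(0, s) for s in (0, 1)]
    backpointers = []

    for i in range(1, len(positions)):
        p = switch_prob(positions[i] - positions[i - 1], cm_per_mb)
        log_switch, log_stay = math.log(p), math.log1p(-p)
        new_score, ptr = [], []
        for s in (0, 1):
            stay = score[s] + log_stay
            switch = score[1 - s] + log_switch
            prev = s if stay >= switch else 1 - s
            new_score.append(max(stay, switch) + emit(i, s))
            ptr.append(prev)
        score = new_score
        backpointers.append(ptr)

    # Trace back from the best final state.
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for ptr in reversed(backpointers):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


# Toy data: three REF-supporting sites followed by three ALT-supporting sites, 1 Mb apart.
toy_positions = [1_000_000 * k for k in range(1, 7)]
toy_path = viterbi(toy_positions, ref_counts=[3, 2, 4, 0, 0, 0], alt_counts=[0, 0, 0, 3, 2, 4])
print(toy_path)
```

For two SNPs 1 Mb apart, `switch_prob` gives roughly 0.005, i.e. about 0.5 cM, which is consistent with the constant in the transition formula. On the toy data the decoded path switches once from REF to ALT, which corresponds to a single crossover.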
## Inputs
- BAM, a sorted and indexed BAM file which contains DNA reads of single sperm cells
with the `CB` tag, e.g. from a single-cell preprocessing pipeline (cellranger)
- VCF, variant call file that contains the list of informative SNPs
- barcodeFile, the list of cell barcodes of the sperm cells
......@@ -75,8 +105,6 @@ ls -lh $HOME/htslib/*.so
Then, `sscocaller` can be installed using `nimble`:
`nimble install https://gitlab.svi.edu.au/biocellgen-public/sscocaller.git`
The built binary is located at `$HOME/.nimble/bin/sscocaller`.
......@@ -115,4 +143,13 @@ sscocaller is available at "./src/sscocaller"
## Downstream analysis in R
The output files from `sscocaller` can be directly parsed into R for construction of individual genetic maps using
the R package `comapr` available from [TBD].
## References
<a id="1">[1]</a>
Hinch, AG. (2019).
Factors influencing meiotic recombination revealed by
whole-genome sequencing of single sperm.
Science, 363(6433).