... | ... | @@ -11,10 +11,37 @@ |
|
|
* <span dir="">A short consideration of quality control for the data;</span>
|
|
|
* <span dir="">Calling crossovers (haplotype transitions) using sgcocaller or software that you might consider appropriate;</span>
|
|
|
* <span dir="">One or more plots about crossover frequencies, positioning and any other features that you consider interesting;</span>
|
|
|
* <span dir="">A brief discussion of any conclusions you reach (or cannot reach) from your analysis.</span>
|
|
|
|
|
|
|
|
|
<span dir="">Be prepared to briefly discuss the choices you make in your analysis. There are many ways to conduct appropriate analyses, so we are not looking for any specific approach; whatever works to complete the tasks is completely fine.</span>
|
|
|
|
|
|
# The mouse genomic data
|
|
|
|
|
|
|
|
|
where an F1 (C57BL6 x FVB) mouse was backcrossed to an FVB mouse. |
|
|
\ No newline at end of file |
|
|
The mice are from what is referred to as a "BC1F1" generation. More specifically, two inbred mice from the strains C57BL6 and FVB were first crossed. The resulting F1 mice were then backcrossed to an inbred FVB mouse, which produced the "BC1F1" mice. It is these BC1F1 mice that were sequenced.
|
|
|
|
|
|
The fastq files, a list of variants, and metadata can be downloaded from the following link:
|
|
|
|
|
|
https://stvincentsinstitute.sharepoint.com/:f:/s/everything/EhsB8UYRrChOiTjyCtUBV-UB1KddW9d0XXKe4SdCzZQr_Q?e=1b0kYd
|
|
|
|
|
|
Fastq files for five mice are provided. Whole-genome coverage may be in the range of 1-5x. If you expect this level of coverage to be problematic, please let me know.
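
To get a feel for what 1-5x coverage means for this backcross design, the following is a minimal sketch rather than part of the required analysis. In a BC1F1 mouse each region is either FVB/FVB or C57BL6/FVB, so only reads carrying the C57BL6 allele distinguish the two states. The sketch assumes read depth at a site is Poisson distributed, that both alleles are sampled equally, and that there are no sequencing or mapping errors. The function name is hypothetical.

```python
import math


def prob_b6_allele_observed(mean_depth: float) -> float:
    """Probability of seeing at least one C57BL6-allele read at a
    heterozygous (C57BL6/FVB) site.

    Assumptions (not from the task description): depth ~ Poisson(mean_depth),
    each read samples either allele with probability 0.5, no errors.
    Under these assumptions the C57BL6 read count is Poisson(mean_depth / 2).
    """
    return 1.0 - math.exp(-mean_depth / 2.0)


if __name__ == "__main__":
    for depth in (1, 2, 5):
        p = prob_b6_allele_observed(depth)
        print(f"mean depth {depth}x: P(C57BL6 allele seen) = {p:.3f}")
```

Under these assumptions a single site is often uninformative at the low end of the range, which is why crossover calling relies on many variants along a chromosome rather than on individual genotypes.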
|
|
|
|
|
|
# Guidance for the analysis and resources
|
|
|
|
|
|
We suggest detecting crossovers with sgcocaller, unless you are already familiar with another appropriate tool. Documentation and tutorials on the use of sgcocaller can be found at the following two links:
|
|
|
|
|
|
https://gitlab.svi.edu.au/biocellgen-public/sgcocaller
|
|
|
|
|
|
https://biocellgen-public.svi.edu.au/hinch-single-sperm-DNA-seq-processing/Crossover-identification-with-sscocaller-and-comapr.html
|
|
|
|
|
|
It is _critical_ that sgcocaller is run in "bulk" mode in order to execute correctly.
|
|
|
|
|
|
Once you have output from sgcocaller, you can perform some exploratory analysis using [comapr](https://bioconductor.org/packages/release/bioc/html/comapr.html) or any other software that you find appropriate.
|
|
|
|
|
|
Questions to consider:
|
|
|
|
|
|
* What is the expected crossover frequency?
|
|
|
* How confident are you in the output of sgcocaller?
|
|
|
* A common fault of crossover analyses is 'dubious' close double crossovers that are supported by only a limited number of variants, perhaps all in the same read. In other words, they can arise from reads that are mapped to regions that are not the true genomic location of the read's origin.
|
|
|
* Do you have enough samples to draw any biological conclusions?
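
One way to screen for the 'dubious' close double crossovers mentioned above is to flag adjacent crossovers on the same chromosome that sit unusually close together. The following is a minimal sketch. The `Crossover` record, the function name, the example positions, and the 1 Mb threshold are all illustrative assumptions, not the sgcocaller or comapr output format or a recommended cutoff.

```python
from collections import defaultdict
from typing import NamedTuple


class Crossover(NamedTuple):
    """Hypothetical record for one called crossover."""
    sample: str
    chrom: str
    pos: int  # approximate crossover position in bp


def flag_close_double_crossovers(crossovers, min_distance_bp=1_000_000):
    """Return pairs of adjacent crossovers closer than min_distance_bp.

    The threshold is an arbitrary assumption for illustration; flagged
    pairs are candidates for manual inspection, not automatic removal.
    """
    grouped = defaultdict(list)
    for xo in crossovers:
        grouped[(xo.sample, xo.chrom)].append(xo)

    flagged = []
    for xos in grouped.values():
        xos.sort(key=lambda xo: xo.pos)
        # Compare each crossover with its right-hand neighbour.
        for left, right in zip(xos, xos[1:]):
            if right.pos - left.pos < min_distance_bp:
                flagged.append((left, right))
    return flagged


# Made-up example data, not real results.
example = [
    Crossover("mouse1", "chr1", 50_000_000),
    Crossover("mouse1", "chr1", 50_200_000),
    Crossover("mouse1", "chr2", 80_000_000),
    Crossover("mouse2", "chr1", 50_100_000),
]

if __name__ == "__main__":
    for left, right in flag_close_double_crossovers(example):
        print(left.sample, left.chrom, left.pos, right.pos)
```

Flagged pairs could then be checked against the number of supporting variants and reads before deciding whether to keep them.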
|
|
|
|
|
|
<span dir="">This task is not meant to be unpleasant or overly onerous. We only expect you to spend 1-2 hours working on it. If you find yourself stuck, feel free to reach out to Wayne for some pointers.</span> |
|
|
\ No newline at end of file |