# Introduction
# **Pre-interview exercise**

Before your interview for the Postdoctoral Fellow position, we would like you to contemplate a hypothetical data analysis task to show us some of your skills. Please write a short report (less than a page) on how you would approach the hypothetical analysis below. In other words, _it is not necessary to download the data and tools if time does not permit_; we just want to understand _how you would approach this problem_. The short report is a hypothetical exercise focusing conceptually on how you would perform this analysis.

\-How do you understand the tasks?

\-What are the steps?

\-What are the possible alternative choices/tools that you might have available?

\-If you have time, you can use an R object [here](https://gitlab.svi.edu.au/biocellgen-public/fancm-crossovers-2022/-/tree/master/output/outputR) and make some plots with a brief exploratory analysis. Example code/tutorial can be found [here](https://bioconductor.org/packages/release/bioc/vignettes/comapr/inst/doc/single-sperm-co-analysis.html).

We respect your time, so please do not spend more than an hour on this pre-interview exercise. If something is simply too detailed to worry about in a one-hour analysis, please skip it.

Please send the report to wcrismani@svi.edu.au no later than 4pm the day before your interview.
# **Hypothetical analysis**

#### (\*\*You do not need to download the raw data and tools if time does not permit. Please just describe how you would attempt it, as specified above.)
## Introduction

Before your interview for the Postdoctoral Fellow position, we would like you to complete a small data analysis task to show us some of your skills.

The task is to conduct a study of meiotic crossover events using bulk whole-genome sequencing data from five mice (described further below).

We would like you to present a short report (either in a document or notebook) that you will submit ahead of the interview. We may discuss your report in the formal interview or during the day, depending on time.

More specifically, we would like to see:
... | ... | @@ -13,9 +33,9 @@ |
* One or more plots about crossover frequencies, positioning and any other features that you consider interesting;
* A brief discussion of any conclusions you reach (or cannot reach) from your analysis.

Be prepared to briefly discuss the choices you make in your analysis. There are many ways to conduct appropriate analyses, so we are not looking for any specific approach; whatever works to complete the tasks is completely fine.
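
As a starting point for plots of crossover frequency and positioning, it can help to tabulate crossovers per sample and the approximate position of each event. The following is a minimal Python sketch, assuming crossovers are already available as `(sample, chromosome, left_bp, right_bp)` tuples. That tuple layout, the function names, and the toy values are hypothetical and do not correspond to the output format of any particular tool.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Hypothetical record layout: (sample, chromosome, left_bp, right_bp),
# where left_bp and right_bp are the markers flanking the crossover.
Crossover = Tuple[str, str, int, int]


def counts_per_sample(crossovers: List[Crossover]) -> Dict[str, int]:
    """Count crossovers per sample (samples with zero events do not appear)."""
    return dict(Counter(sample for sample, _, _, _ in crossovers))


def midpoints_by_chrom(crossovers: List[Crossover]) -> Dict[str, List[float]]:
    """Collect the sorted midpoint of each crossover interval, per chromosome."""
    mids: Dict[str, List[float]] = {}
    for _, chrom, left, right in crossovers:
        mids.setdefault(chrom, []).append((left + right) / 2)
    return {chrom: sorted(values) for chrom, values in mids.items()}


# Made-up toy input, for illustration only.
toy = [
    ("mouse1", "chr1", 1000, 3000),
    ("mouse1", "chr2", 500, 700),
    ("mouse2", "chr1", 2000, 2400),
]
print(counts_per_sample(toy))   # {'mouse1': 2, 'mouse2': 1}
print(midpoints_by_chrom(toy))  # {'chr1': [2000.0, 2200.0], 'chr2': [600.0]}
```

The per-sample counts can feed a simple bar or dot plot, and the midpoints can be drawn along each chromosome to show positioning. With only five mice, any such summary is descriptive rather than a basis for strong conclusions.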
## The mouse genomic data

The mice are from what is referred to as a "BC1F1" generation. More specifically, two inbred mice from the strains C57BL6 and FVB were first crossed. The resulting F1 mice were then backcrossed to an inbred FVB mouse, which produced the "BC1F1" mice. It is these BC1F1 mice that were sequenced.
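
To make the expected signal concrete: the F1 parent carries one C57BL6 and one FVB haplotype, and the other parent is inbred FVB. Each region of a BC1F1 genome should therefore be either heterozygous (C57BL6/FVB) or homozygous FVB, and a switch between the two states along a chromosome marks a crossover from the F1 meiosis. The following is a minimal Python sketch of that logic, assuming per-marker states have already been called. The function name and the `"het"`/`"hom"` labels are hypothetical, and this is not a description of how sgcocaller works internally.

```python
from typing import List, Optional, Tuple


def crossover_intervals(
    positions: List[int], states: List[Optional[str]]
) -> List[Tuple[int, int]]:
    """Return (left, right) marker positions flanking each state switch.

    positions: sorted marker coordinates on one chromosome (bp).
    states: per-marker label, "het" (C57BL6/FVB) or "hom" (FVB/FVB);
            None marks an uncalled marker and is skipped.
    """
    intervals = []
    prev_pos, prev_state = None, None
    for pos, state in zip(positions, states):
        if state is None:
            continue  # no information at this marker
        if prev_state is not None and state != prev_state:
            # The crossover lies somewhere between the two flanking markers.
            intervals.append((prev_pos, pos))
        prev_pos, prev_state = pos, state
    return intervals


example = crossover_intervals(
    [100, 200, 300, 400, 500],
    ["het", "het", None, "hom", "hom"],
)
print(example)  # [(200, 400)]
```

In real low-coverage data, isolated miscalled markers would create spurious switches, which is one reason model-based callers smooth over neighbouring markers instead of comparing adjacent calls directly.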
... | ... | @@ -23,7 +43,7 @@ The fastq files, a list of variants, and metadata can be downloaded [here](https |
Fastq files for five mice are provided. Whole-genome coverage may be in the range of 1-5x. If this level of coverage is expected to be problematic, please let me know.
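
One way to reason about whether 1-5x coverage is workable: at a heterozygous SNP covered by d reads, the chance of seeing no C57BL6 allele at all is roughly 0.5^d (binomial sampling, ignoring sequencing error), so single-site calls are unreliable, but pooling reads over many SNPs in a window recovers the signal. The following is a minimal Python sketch, assuming per-SNP read counts supporting each strain's allele are available. The function name, window size, minimum read count, and the 0.25 threshold are illustrative assumptions and not recommended settings.

```python
from typing import List, Optional


def call_windows(
    b6_counts: List[int],
    fvb_counts: List[int],
    snps_per_window: int = 50,
    min_reads: int = 20,
    het_threshold: float = 0.25,
) -> List[Optional[str]]:
    """Call each window of consecutive SNPs as "het", "hom", or None.

    The expected C57BL6 read fraction is about 0.5 in heterozygous regions
    and about 0 in homozygous FVB regions; het_threshold sits between them.
    """
    calls: List[Optional[str]] = []
    for start in range(0, len(b6_counts), snps_per_window):
        b6 = sum(b6_counts[start:start + snps_per_window])
        fvb = sum(fvb_counts[start:start + snps_per_window])
        total = b6 + fvb
        if total < min_reads:
            calls.append(None)  # too few reads to call this window
            continue
        calls.append("het" if b6 / total >= het_threshold else "hom")
    return calls


# Chance that a heterozygous SNP covered by 2 reads shows no C57BL6 allele.
print(0.5 ** 2)  # 0.25
```

A fixed threshold like this is only a rough heuristic; a hidden Markov model over the same counts is a common alternative because it also uses the fact that neighbouring windows usually share a state.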
## Guidance for the analysis and resources

We suggest detecting crossovers with sgcocaller, unless you are already familiar with another appropriate tool. Documentation and tutorials on the use of sgcocaller can be found at the following two links:
... | ... | @@ -31,8 +51,6 @@ https://gitlab.svi.edu.au/biocellgen-public/sgcocaller |
https://biocellgen-public.svi.edu.au/hinch-single-sperm-DNA-seq-processing/Crossover-identification-with-sscocaller-and-comapr.html

It is _critical_ that sgcocaller is run in "bulk" mode in order to execute correctly.

Once you have output from sgcocaller, you can perform some exploratory analysis using [comapr](https://bioconductor.org/packages/release/bioc/html/comapr.html) or any other software that you find appropriate.

Questions to consider:
... | ... | |