

Pre-interview exercise

Before your interview for the Postdoctoral Fellow position, we would like you to consider a hypothetical data analysis task to show us some of your skills. Please write a short report (less than a page) describing how you would approach the hypothetical analysis below. It is not necessary to download the data and tools if time does not permit; we just want to understand _how you would approach this problem_. The report is a conceptual exercise focused on how you would perform this analysis.

-How do you understand the tasks?

-What are the steps?

-What are the possible alternative choices/tools that you might have available?

-If you have time, you can use an R object here and make some plots with a brief exploratory analysis. These R objects are the output of sgcocaller and comapr workflows. Example code/tutorial can be found here. It is fine to reuse code from the tutorial, but please acknowledge code which has been reused, and explain its purpose in your own words.

We respect your time, so please do not spend more than an hour on this pre-interview exercise. If something is simply too detailed to worry about in a one-hour analysis, please skip it.

Please send the report to no later than 4pm the day before your interview.



Before your interview for the Postdoctoral Fellow position we would like you to complete a small data analysis task to show us some of your skills.

The task is to conduct a study of meiotic crossover events using either the pre-processed bulk whole genome sequencing data or the pre-processed single-sperm sequencing data (described further below).

We would like you to present a short report (preferably as an R markdown report which includes any code used) that you will submit ahead of the interview. We may discuss your report in the formal interview or during the day, depending on time.

More specifically, we would like to see:

  • A short consideration of quality control for the data;
  • Calling crossovers (haplotype transitions) using sgcocaller or software that you might consider appropriate;
  • One or more plots about crossover frequencies, positioning and any other features that you consider interesting;
  • A brief discussion of any conclusions you reach (or cannot reach) from your analysis.
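As a rough illustration of what "haplotype transitions" means here, the sketch below counts state changes along one chromosome from an ordered list of per-marker (or per-bin) haplotype labels. The function name `count_transitions` and the labels `"Het"` and `"HomFVB"` are hypothetical and chosen only for this example. This is a naive count, not the method used by sgcocaller or comapr; a dedicated caller would model sequencing error and low coverage more carefully.

```python
def count_transitions(states):
    """Count haplotype state changes along one chromosome.

    `states` is an ordered list of labels (e.g. "Het", "HomFVB"),
    with None marking positions where no call could be made.
    Missing positions are skipped rather than treated as a state.
    """
    observed = [s for s in states if s is not None]
    # A transition is any adjacent pair of observed labels that differ.
    return sum(1 for a, b in zip(observed, observed[1:]) if a != b)


# Hypothetical labels along one chromosome, in genomic order.
example = ["Het", "Het", None, "HomFVB", "HomFVB", "Het"]
n_crossovers = count_transitions(example)
```

Because every single-marker flip is counted, a raw count like this tends to overestimate crossovers in noisy data, which is one reason the quality control step matters.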

Be prepared to briefly discuss the choices you make in your analysis. There are many ways to conduct appropriate analyses, so we are not looking for any specific approaches—whatever works to complete the tasks is completely fine.

The mouse genomic data

The mice are from what is referred to as a "BC1F1" generation. More specifically, two inbred mice from the strains C57BL6 and FVB were first crossed. The resulting F1 mice were then backcrossed to an FVB inbred mouse, which produced the "BC1F1" mice. It is these BC1F1 mice that were sequenced at around 1-5x coverage. Sperm from F1 mice were also sequenced, generally below 1x coverage, and an R object for these is also provided here
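One consequence of this cross design is that each genomic region of a BC1F1 mouse should be either heterozygous (C57BL6/FVB) or homozygous FVB, so the fraction of reads carrying the C57BL6 allele is expected to sit near 0.5 or near 0. The sketch below labels genomic bins on that basis. The function name `label_bins`, the `min_reads` cutoff and the `het_threshold` value are all assumptions made for illustration, not parameters taken from sgcocaller or comapr, and would need tuning against the real data.

```python
def label_bins(b6_counts, fvb_counts, min_reads=5, het_threshold=0.25):
    """Label each bin as "Het", "HomFVB", or None (too few reads).

    b6_counts / fvb_counts: per-bin counts of reads supporting the
    C57BL6 and FVB alleles at informative SNPs (hypothetical inputs).
    """
    labels = []
    for b6, fvb in zip(b6_counts, fvb_counts):
        total = b6 + fvb
        if total < min_reads:
            # Not enough evidence at this coverage to call the bin.
            labels.append(None)
        elif b6 / total >= het_threshold:
            # Substantial C57BL6 support: consistent with heterozygous.
            labels.append("Het")
        else:
            # Almost no C57BL6 support: consistent with homozygous FVB.
            labels.append("HomFVB")
    return labels


# Four hypothetical bins along one chromosome.
labels = label_bins([6, 0, 1, 5], [6, 12, 2, 7])
```

At 1-5x coverage, individual SNPs carry little information, which is why this sketch aggregates reads over bins before assigning a label.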

Guidance for the analysis and resources

You may perform some exploratory analysis using comapr or any other software that you find appropriate.

Questions to consider:

  • What is the expected crossover frequency?
  • How confident are you in the output of sgcocaller?
  • A common fault of crossover analyses is 'dubious' close double crossovers that are supported by only a limited number of variants, perhaps all in the same read. Such calls can arise from reads that are mapped to regions that are not their true genomic origin. Are there any dubious close double crossovers in this sgcocaller output?
  • Do you have enough samples to draw any biological conclusions?
  • How would you approach crossover detection if, in contrast to this experiment (which backcrossed genetically identical F1s to make BC1F1s), an experiment interbred multiple genetically unique F4 mice to produce F5 pups?
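The question about dubious close double crossovers can be made concrete with a small filter. The sketch below assumes the calls have already been summarised as consecutive haplotype segments per chromosome, and flags any short, weakly supported segment that is flanked on both sides by the same state. The `Segment` fields, the function name and the `max_length` and `min_snps` cutoffs are hypothetical choices for this example, not values or structures taken from sgcocaller or comapr.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    chrom: str
    start: int   # first supporting position (bp)
    end: int     # last supporting position (bp)
    state: str   # haplotype label, e.g. "Het" or "HomFVB"
    n_snps: int  # number of informative variants supporting the segment


def flag_dubious_double_crossovers(segments, max_length=1_000_000, min_snps=10):
    """Return middle segments that imply a close, weakly supported double crossover.

    `segments` must be sorted by chromosome and position.
    """
    flagged = []
    for prev, mid, nxt in zip(segments, segments[1:], segments[2:]):
        same_chrom = prev.chrom == mid.chrom == nxt.chrom
        # The state switches and then switches straight back.
        flips_back = prev.state == nxt.state and mid.state != prev.state
        is_short = (mid.end - mid.start) <= max_length
        is_weak = mid.n_snps < min_snps
        if same_chrom and flips_back and is_short and is_weak:
            flagged.append(mid)
    return flagged


# Hypothetical segments: the second one spans 300 bp with only 3 SNPs.
segments = [
    Segment("chr1", 0, 50_000_000, "Het", 4000),
    Segment("chr1", 50_000_001, 50_000_300, "HomFVB", 3),
    Segment("chr1", 50_000_301, 120_000_000, "Het", 5000),
    Segment("chr1", 120_000_001, 195_000_000, "HomFVB", 6000),
]
dubious = flag_dubious_double_crossovers(segments)
```

Flagged segments are candidates for manual inspection rather than automatic removal, since a genuine close double crossover or gene conversion would look similar at this resolution.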

This task is not meant to be unpleasant or overly onerous. We only expect you to spend an hour working on it. If you find yourself stuck, feel free to reach out to Wayne for some pointers.

Please send the report to no later than 4pm the day before your interview.