Project Writeup Instructions

Your reports should be focused on describing what you did and any relevant background. The purpose of our projects is to attempt to reproduce the findings in a published manuscript. You do not need to restate any of the methods from the paper itself; only what you yourself did in pursuit of reproducibility. The introduction should include a brief discussion (not more than a paragraph) summarizing the premise of the original study, just to give the reader enough context to understand the results.

Submit each project by adding and pushing your report document to your GitHub repo before the start of class on the day the project is due! The project will be discussed on the due date, so late assignments cannot be accepted. If you have trouble pushing the document, you may email it to the instructor and your TA instead.


How do you assign grades in this class?

I take both overall group and individual performance into account when assigning grades at the end of the semester. The feedback we give is intended to help you improve your writing, critical thinking, and biological/bioinformatic reasoning skills. We therefore want you to focus more on the content of the feedback than on the grade for each project. For this reason, our feedback will specify the grade range you would receive at the end of the course based on the quality of each report.

The only guaranteed ways to get a lower grade are to not follow the instructions in the project descriptions, or to not incorporate feedback you received on previous projects.

I’m having trouble with my section and I don’t want my group’s grade to suffer. What happens in this case?

We track which team member played which role on each project. Please focus only on the sections of the report that correspond to your role and not on those of other roles. This is both out of fairness and because doing otherwise would prevent your TAs and me from giving feedback on each of your contributions. It is fine for other group members to go over and revise the overall report for consistency and formatting reasons, but the content (writing, analysis, plots, etc.) of each section should be completed primarily by the corresponding role.

For report sections requiring input from several roles, add content specific to your role as well as you can. It is OK to write the report using the precomputed results provided if needed. It is always better to submit something incomplete than nothing, and you may consider commenting in the report on the difficulties you had if you were unable to complete your section.

What happens if we don’t get results like those in the paper?

We are not assessing your reports based on whether you replicate the results or not! We are looking for your process. If your analysis results look nothing like what was reported in the paper, but you describe those results accurately and reason about why they were different, that is all I’m looking for. There is never one “right way” to do these studies, and in any case your results will almost never match those reported precisely. Use your reasoning powers to explain your results, whatever they are.

How should we format our reports?

The format of the report should be roughly similar to that of a published paper, with appropriate section headings, prose, and figures/tables interspersed throughout as appropriate. In the past, groups have used Word, Google Docs, etc. as the word processing tool. Be sure to review the writeup instructions below to get an idea of what specifically we will be looking for.

Report Guidelines

Your reports will be assessed in six areas:

You will be provided with detailed feedback on each of your reports with specific comments on each of these areas. Your final grade will be assigned based on how well you respond to the comments for your projects overall.

Report Sections Description

Each of these sections should be included in your report. The lead author of each section is the indicated role, but other roles may need to contribute text to the section depending on the project. The lead role should coordinate the collection and organization of the materials in their section.


Introduction

Lead author: shared

  • What is the biological background of the study?
  • Why was the study performed?
  • Why did the authors use the bioinformatic techniques they did?


Data

Lead author: Data Curator

The data section should describe the data as completely as possible. Note that typical manuscripts will not have a dedicated Data section; this information will usually be included in the Methods.

  • Data Description, e.g.:
    • Which instrument was used?
    • What protocol was used to prepare the samples?
    • For which genome was the data generated?
    • How many samples of which types?
    • For arrays, what type of microarray was used?
    • For sequencing datasets:
      • What was the average library size, i.e. number of reads?
      • How long are the reads?
      • Are the reads single or paired end?
  • What was the source of the data? Include references, links to the public repositories, etc.
  • Data quality control:
    • How was the data assessed to be of high quality?
    • Were any samples eliminated due to low quality?
    • Were there any sources of error or contamination detected?
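
For sequencing datasets, the library size and read length questions above can often be answered with a few lines of code. The following is only a minimal sketch in Python, assuming uncompressed or gzip-compressed FASTQ files with the common four-lines-per-record layout; the function names and the tiny in-memory example are illustrative and not part of any project pipeline. Dedicated QC tools report the same numbers along with much more detail.

```python
import gzip
import io


def summarize_fastq(handle):
    """Return (read_count, mean_read_length) for an open FASTQ text handle.

    Assumes four lines per record (no wrapped sequence lines).
    """
    n_reads = 0
    total_bases = 0
    for i, line in enumerate(handle):
        if i % 4 == 1:  # the second line of each record holds the bases
            n_reads += 1
            total_bases += len(line.strip())
    if n_reads == 0:
        return 0, 0.0
    return n_reads, total_bases / n_reads


def open_fastq(path):
    """Open a plain or gzip-compressed FASTQ file in text mode."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "rt")


# Tiny in-memory example with two made-up reads (4 and 6 bases long).
example = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\nACGTAC\n+\nIIIIII\n")
n_reads, avg_len = summarize_fastq(example)
print(f"reads: {n_reads}, mean length: {avg_len}")
```

For a real sample, the same function could be called as `summarize_fastq(open_fastq(path))` for each file, and the per-sample values collected into a table for the Data section.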


Methods

Lead author: Programmer

The methods section should concisely describe which steps were taken in the analysis of the data.

  • How was the data normalized?
  • How were outliers detected and removed?
  • What summarization method was used, if necessary?
  • Overview: Explain the rationale for the method and tools selected to perform each step in the analysis
  • Briefly describe the algorithms used to analyze the data, with graphical illustrations if necessary
  • Describe the specific steps, software versions, and software parameters that were used to process the data
  • How was the analysis run? How long did it take? What computational resources were required?
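
One way to collect the software versions and run times asked for above is to record them while the analysis runs. The sketch below assumes a Python-based workflow; the package names and `toy_step` are placeholders chosen for illustration, not tools required by any project. It only covers Python packages, so versions of command-line tools or R packages would need to be recorded separately.

```python
import platform
import sys
import time
from importlib import metadata


def package_versions(names):
    """Map each package name to its installed version, or None if not installed."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def timed(func, *args, **kwargs):
    """Run func and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


def toy_step(n):
    """Hypothetical stand-in for a real analysis step."""
    return sum(range(n))


result, seconds = timed(toy_step, 1000)
report = {
    "python": sys.version.split()[0],
    "platform": platform.platform(),
    "packages": package_versions(["numpy", "pandas"]),  # example names only
    "toy_step_seconds": round(seconds, 4),
}
print(report)
```

Saving a record like `report` next to the results makes it straightforward to fill in the versions, run time, and computing environment when writing the Methods section.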


Results

Lead author: Analyst

The results section presents the primary findings of the study. This should generally be a simple description of the results, and all discussion and interpretation should be reserved for the Discussion section.

  • Results from each step in the methods section
  • If something failed, explain why you think this happened and suggest alternatives or fixes
  • Figures and/or tables describing your results, with descriptive captions


Discussion

Lead author: shared

Discuss and interpret the results in the larger biological context of the study. Begin with a brief restatement of the primary results, followed by any interpretations and conclusions drawn.

  • Briefly summarize the overall method and main findings
  • What are the implications of your main findings?
  • What biological interpretation do the findings suggest?
  • Were you able to reproduce the result from the original paper? If not, why not?


Conclusion

Lead author: shared

State the overall conclusions the reader should draw from the study. Note that typical manuscripts will not have a dedicated Conclusion section; this information will usually be found at the end of the Discussion.

  • Concisely state the overall conclusion the reader should draw from the analysis
  • Describe any specific challenges or problems you encountered in the project, how you overcame them, and for those that you were unable to solve, what more would you need to solve them?


References

Lead author: shared

List any publications cited