Final projects

General

The final project is the main assignment of the course. Projects are required to be related in a substantive way to at least one of the central topics of the course. Final projects can be done in groups of 1–3 people; in our experience, groups of 3 lead to the best outcomes, so we encourage you to form a team of that size.

Each project team will be assigned a mentor (a member of the teaching team), who will provide feedback on all their project-related work and generally be available as a resource.

Here is very detailed guidance on the intellectual and practical aspects of projects (for this course, and in NLP more generally).

Submission format

The literature review, experiment protocol, and final paper must use the ACL submission format and abide by all the ACL requirements except where we have specified otherwise. See below for links to Overleaf projects that you can use as starter templates for each project component.

Literature review paper

This is a short paper (≈6 pages) summarizing and synthesizing several papers in the area of your final project. As noted above, 8 pages is the maximum allowed length.

Here is an Overleaf template for the literature review. You are not required to use this template, but it is encouraged.

Groups of one should review 5 papers, groups of two should review 7 papers, and groups of three should review 9.

Ideally, your lit review and final project will share a topic. However, you may discover during the lit review that your topic isn't ideal for you, in which case you can switch topics (or groups) for the final project; your lit review will be graded on its own terms. Major things to include (the phrase that opens each item makes a good section heading):

  1. General problem/task definition: What are these papers trying to solve, and why?
  2. Concise summaries of the articles: Do not simply copy the article text in full. We can read them ourselves. Put in your own words the major contributions of each article.
  3. Compare and contrast: Point out the similarities and differences of the papers. Do they agree with each other? Are results seemingly in conflict? If the papers address different subtasks, how are they related? (If they are not related, then you may have made poor choices for a lit review...) This section is probably the most valuable for the final project, as it can become the basis for a lit review section.
  4. Future work: Make several suggestions for how the work can be extended. Are there open questions to answer? This would presumably include how the papers relate to your final project idea.
  5. References section: The entries should appear alphabetically and give at least full author name(s), year of publication, title, and outlet if applicable (e.g., journal name or proceedings name). Beyond that, we are not picky about the format. Electronic references are fine but need to include the above information in addition to the link.

Experiment protocol

This is a short, structured report designed to help you establish your core experimental framework. 8 pages is the maximum, but the norm is for protocols to be shorter than that.

Here is an Overleaf template for the experiment protocol. You are not required to use this template, but it is encouraged.

Required pieces:

  1. Hypotheses: A statement of the project's core hypothesis or hypotheses.
  2. Data: A description of the dataset(s) that the project will use for evaluation.
  3. Metrics: A description of the metrics that will form the basis for evaluation. We require at least one of these to be a quantitative metric, but we are very open-minded about which ones you choose. In requiring this, we are not saying that all work in NLU needs to be evaluated quantitatively, but rather just that we think it is a healthy requirement for our course.
  4. Models: A description of the models that you'll be using as baselines, and a preliminary description of the model or models that will be the focus of your investigation. At this early stage, some aspects of these models might not yet be worked out, so preliminary descriptions are fine.
  5. General reasoning: An explanation of how the data and models come together to inform your core hypothesis or hypotheses.
  6. Summary of progress so far: What you have done, what you still need to do, and any obstacles or concerns that might prevent your project from coming to fruition.
  7. References section: In the same format as for the lit review.

Final paper

This paper should adhere to the formal requirements and stylistic expectations for research contributions in NLP.

Unlike for the lit review and experiment protocol, you are required to use one of the following templates for your submission:

We'll provide additional guidance on writing up research in NLP. The course readings include many exceptionally good examples of NLP papers in this format. A selection of exemplary papers from prior years of this course is available here (link restricted to enrolled students).

There are two required paper sections that are special to our course:

  • Known project limitations: For this section, imagine that your reader is a well-intentioned NLP practitioner who is seeking to make use of your data, models, or findings as part of a separate scholarly project, deployed system, or some other kind of real-world intervention. What should such a person know about your work? Especially important here are limitations and biases that might affect this person, their findings, their experiment participants, or the users of their product or service. The idea is that what you say here will be taken into consideration by this well-intentioned user, leading to better outcomes for everyone.
  • Authorship statement: At the end of your paper (after the 'Acknowledgments' section in the template), please include a brief authorship statement, explaining how the individual authors contributed to the project. You are free to include whatever information you deem important to convey. For guidance, see the second page, right column, of this guidance for PNAS authors. We are requiring this largely because we think it is a good policy in general. This statement is required even for singly authored papers, because we want to know whether your project is a collaboration with people outside of the class. Only in extreme cases, and after discussion with the team, would we consider giving separate grades to team members based on this statement.