Information Retrieval

Welcome to COSC431

Semester 1, 2017


Andrew Trotman (Owheo building, room 123A)


If you have ever used a Web search engine, like Google or Yahoo, you will realise how helpful computers can be in finding information, and how frustrating. This paper will tell you how we can use computers to find information in unstructured or semi-structured text, and why it is as hard as it is important to do better. We'll start from the basics of IR, such as "what the heck is a word, anyway?", and cover some recent research.

Much of the presentation will be directed reading; some of the key papers are so opaque that we shall also have some lectures.


Prerequisites: second-year programming and data structures.


There will be an examination worth 60% and two practical assignments worth a total of 40%.

  • Assignment 1 (20%) - Download here. Due Date: 12th April 2017
  • Assignment 2 (20%) - Download here. New Due Date: 29th May 2017
  • Examination (60%)


  • Date: To be confirmed
  • Time: To be confirmed
  • Room: To be confirmed

Lecture Schedule 2017

Lectures: Wednesdays, 11am to 1pm, in Owheo room G34.

1. Course outline and Outstanding Issues in IR
2. Searching
3. Efficiency and Ranking
4. Evaluation
5. Term Conflation
6. ROK: Parsing and XML
7. ROK: Compression (slides, audio recording, and notes)
8. Relevance Feedback
9. Phrase and Structured Search
10. Distributed Information Retrieval
11. XML-IR and Link Discovery
12. COSC431 Presentations
13. Revision (Come with questions prepared)


Student Administration have asked us to add this note on Plagiarism:
"Students should make sure that all submitted work is their own. Plagiarism is a form of dishonest practice. Plagiarism is defined as copying or paraphrasing another's work, whether intentionally or otherwise, and presenting it as one's own (approved University Council, December 2004). In practice this means plagiarism includes any attempt in any piece of submitted work (e.g. an assignment or test) to present as one's own work the work of another (whether of another student or a published authority). Any student found responsible for plagiarism in any piece of work submitted for assessment shall be subject to the University's dishonest practice regulations which may result in various penalties, including forfeiture of marks for the piece of work submitted, a zero grade for the paper, or in extreme cases exclusion from the University."