COMP348: Document Processing and the Semantic Web

Resources and Support Materials : COMP348

Textbooks

There is no single textbook for the unit, although at least the first half of the unit will lean heavily on what we call 'the NLTK Book': Natural Language Processing with Python: Analyzing Text with the Natural Language Toolkit, by Steven Bird, Ewan Klein, and Edward Loper. This book is published by O'Reilly, and a free version is also available online.

Other readings will be assigned week by week in class. These will always be available either in the library or online.

The following books provide additional information on topics covered during the course, and are either in the library (call numbers are in parentheses) or online via the link provided:

  • Jackson and Moulinier, Natural Language Processing for Online Applications, John Benjamins, ISBN: (Call no. QA76.9.N38 J33 2002)
  • Jurafsky and Martin, Speech and Language Processing, Upper Saddle River, NJ: Prentice Hall, 2000. xxvi, 934p. ISBN: 0-13-095069-6. (Call no. P98.J87/2000) There is a second edition of this book which, unfortunately, the library appears not to have at this time. We're working on that; in the interim you'll still find the first edition a very useful read.
  • Manning and Schütze, Foundations of Statistical Natural Language Processing, Cambridge, MA: MIT Press, 1999. xxxvii, 680p. ISBN: 0-262-13360-1. (Call no. P98.5.S83.M36/1999)
  • Manning, Raghavan and Schütze, Introduction to Information Retrieval, Cambridge University Press, 2008. ISBN: 0521865719. You can download the book in PDF format.

Each week, you will be assigned some reading for the unit. This is obligatory: you will not understand the lectures if you do not keep up with the reading, and you will not be able to do the practical exercises if you do not do the reading.

You will also need to access materials on programming in Python. If you want to get a book, look at the list of Python books at python.org. Other online resources are mentioned below.

Online Resources

Below is a list of online resources for the unit. The contents will change as the course progresses. If you find any broken links, let us know. Watch this space!

Past Exams

Python: Syntax

Python: Conventions and Static Typing

Python Software

Semantic Web and RDF

  • A series of articles by Paul Ford on xml.com, beginning with "Screenscraping the Senate", provides a useful introduction to some of the technologies involved in building the Semantic Web, as well as a concrete example.
  • The OWL Guide from the W3C isn't too hard to read and provides a fairly complete overview of the Web Ontology Language OWL.
  • Practical RDF, a book by Shelley Powers, is available via Safari Books Online from within the MQ campus. (Search for it through the library if the link doesn't work.)

Semantic Web Software