Department of Computing
The Department of Computing is offering up to six vacation scholarships for undergraduate students to work on research projects in the Department for 6-8 weeks during January and February 2009. The scholarships carry a $350/week tax free stipend and are open to students who have completed second or third year studies in Computing at Macquarie or any other Australian University.
To apply for a scholarship, please fill in the application form (download MS Word, or PDF) and send it along with your CV and academic transcript to: Vacation Scholarships c/o Dr. Yan Wang, Department of Computing, Macquarie University, NSW 2109.
On your application you should indicate the area of research you are interested in pursuing. Listed below are a number of projects that staff members have put forward. Please choose two project titles on your application form.
Applications close: 1st December 2008.
Research Projects 2009
- Building an Oberon Compiler - Tony Sloane and Dom Verity
- Open Source for AnswerFinder - Diego Molla
- Building a Web Service - Diego Molla
- A framework to Test Syntactic Patterns - Diego Molla
- Question Classification - Diego Molla
- Corpus-based Correction of OCR-introduced Spelling Errors - Robert Dale
- Inferring Document Structure - Robert Dale
- An Automated Newsreader - Robert Dale
- The Automatic Generation of Spoken Stock Market Reports - Robert Dale
- Generating News Summaries for SMS Delivery - Robert Dale
- Feedback for Mates - Rolf Schwitter
- Abduction for text interpretation - Rolf Schwitter
- Trusted Web Search - Vijay Varadharajan
- A Trust Management System based on Web Services Technology - Yan Wang
Building an Oberon Compiler
Supervisors: Tony Sloane and Dominic Verity
Oberon is an imperative programming language in the Pascal tradition, designed by Pascal's originator (and Turing Award winner) Niklaus Wirth. This language will be familiar to students who have studied the unit COMP332 "Programming Languages" here at Macquarie, since it is closely related to the much simpler language Obr which we have been using in that unit to teach the principles of compiler construction.
The overall goal of this project is to build a complete reference implementation of the Oberon 0 variant of this language, as featured in the book Compiler Construction by Prof. Wirth (which is available electronically from http://www.oberon.ethz.ch/).
The implementation will be written in the Scala programming language, using the Kiama language processing library that is being developed at Macquarie. Scala is a hybrid object-oriented and functional programming language that runs on the Java virtual machine (http://www.scala-lang.org/).
Required background:
- Very good programming skills in at least 2 programming languages.
- A good understanding of how programming language compilers are structured and implemented, to the level taught in COMP332 for instance.
- Should be motivated to learn new programming languages.
This project will suit a student who is interested in a Kiama-based language processing honours project, since it will provide experience with Scala and Kiama in a simpler setting. Contact Tony Sloane if you would like more information about Kiama.
Summer Projects
Supervisor: Diego Molla
Open Source for AnswerFinder
Currently AnswerFinder, our question answering system, uses proprietary code developed by third parties. The goal of this project is to replace the proprietary code with open-source code. This will involve adapting the interfaces of AnswerFinder to the proprietary modules so that they can handle the new modules. This project is ideal for people with strong programming skills and an interest in working on an established software project.
Required background: Good programming skills in Python or C++.
Desired background: Experience with medium to large software projects
Building a Web Service
The goal of this project is to convert AnswerFinder, our question answering system, into a Web Service. The details of how this could be done are available in:
Definition of QA systems as Web services (http://www.linguateca.pt/documentos/QolA/defWebservicesQolA.html)
Required background: Good programming skills in C++; experience with web technology (e.g. Pass grade in COMP249).
Desired background: Experience with XML programming; experience with Web Services.
A framework to Test Syntactic Patterns
A Masters student has developed a set of question-answering patterns as part of her project. The task of this project is to build a system that tests these patterns on a collection of questions and answer candidates.
Required background: Good programming skills, preferably in C++.
Desired background: Pass grade in SLP148, COMP248 or COMP348.
Question Classification
AnswerFinder currently uses a very simple method to classify questions. The task of this project is to expand the current method by introducing patterns based on syntactic information and/or machine learning techniques.
Required background: Good programming skills, preferably in C++; Pass grade in SLP148, COMP248, or COMP348.
Desired background: Experience in programming in a group.
Summer Projects
Supervisor: Robert Dale
Intelligent Document Processing
Corpus-based Correction of OCR-introduced Spelling Errors
A common way to archive legacy documents is to run them through a scanner to produce a PDF file, to which a searchable text layer is added using optical character recognition (OCR). Unfortunately, OCR is not perfect, so spelling errors are introduced that damage the effectiveness of search techniques.
Using an existing corpus of several thousand scanned academic papers (in the ACL Anthology), this project aims to develop automatic spelling correction techniques that use the corpus itself as a source of evidence for spelling corrections. For example, if the misrecognised string spe11in8 appears in a document, a simple distance metric may find other similar strings, such as spelling, to be much more frequent in the corpus, and on the basis of frequency then choose this as a correction. Of course it gets much more complicated than this, which is why it's interesting ...
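The frequency-plus-distance idea in the example above can be sketched in a few lines of Python. This is only an illustrative sketch, not the project's method: the function names, the distance threshold of 3 and the toy word counts are all assumptions, and Levenshtein distance is used as one possible "simple distance metric".

```python
from collections import Counter


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete ca
                               current[j - 1] + 1,       # insert cb
                               previous[j - 1] + cost))  # substitute
        previous = current
    return previous[-1]


def suggest_correction(token: str, corpus_counts: Counter, max_distance: int = 3) -> str:
    """Return the most frequent corpus word within max_distance of token.

    The token is left unchanged if no nearby word is more frequent than it.
    """
    candidates = [
        (count, word)
        for word, count in corpus_counts.items()
        if word != token and edit_distance(token, word) <= max_distance
    ]
    if not candidates:
        return token
    best_count, best_word = max(candidates)
    return best_word if best_count > corpus_counts.get(token, 0) else token


# Toy counts standing in for word frequencies gathered from the whole corpus (assumed data).
corpus_counts = Counter({"spelling": 120, "spe11in8": 1, "search": 40, "errors": 35})
print(suggest_correction("spe11in8", corpus_counts))
```

A real system would need more than this, for instance a distance metric weighted by typical OCR confusions (1 for l, 8 for g) and some use of the surrounding context.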
Inferring Document Structure
Documents have a physical structure -- typically consisting of pages, columns, and paragraphs -- but they also have a logical structure, consisting of title information, sections, subsections, footnotes, tables and so on. PDF documents are primarily intended for rendering on a screen or a printer, and so are focussed on physical structure; they tend not to contain much information, if any, about the logical structure of the document. But that logical structure can be important for a variety of purposes; for example, knowing the logical structure of a document can assist in information retrieval, information extraction and text summarisation.
The aim of this project is to take a corpus of PDF documents, and to build a system that can automatically extract the logical structure of the document text, so that this can be provided in XML form for a variety of more sophisticated processing stages, or for a more flexible rendering model (for example as a hierarchically unfolding document in a web browser).
Spoken Language Dialog Systems
An Automated Newsreader
Automated newsreaders -- 'talking heads' that read out news stories in synthesized voice -- have been constructed before. These take a textual news source and then use a text-to-speech synthesis engine, in conjunction with an animated head, to deliver that news in spoken language.
The aim of this project is to build such a system with increased realism, by incorporating both appropriate facial gestures and appropriate intonation in the voice. Watch some newsreaders carefully to see how they use their facial expressions to communicate information, and listen to how they use prosody to increase interest in what they are saying. The challenge here is to find techniques that will allow us to derive appropriate audiovisual features from a 'flat text' provided as input.
Natural Language Generation
The Automatic Generation of Spoken Stock Market Reports
Stock market data -- information about the prices of stocks and shares -- is a valuable commodity that many people pay money for in order to receive it in a timely fashion. That's OK if you're sitting at your desk with a web browser, or have access to some other internet-enabled device that allows you to access a relevant website. But there are situations where you'd really like to have information provided verbally; and ideally you'd like to have it personalised to your own interests and stock holdings.
The aim of this project is to build a system that interrogates a stock market price database and, in conjunction with a user profile, works out how to construct a text that summarises the relevant information for that user; this text is then delivered via a text-to-speech system, so that the user can access it when they are driving or in some other hands-busy, eyes-busy context.
Generating News Summaries for SMS Delivery
There are many news services available via the web, but there's a problem when it comes to delivering news to a mobile phone: you only get 160 characters in an SMS message.
The aim of this project is to develop techniques that can analyse a news story and produce a summary that will fit into 160 characters. In many cases the headline will already be short enough, but for that same reason it may not contain much information, so we need to extract more information-rich content from the text of the story, and then find ways to compress it into the available space. This involves using what we might think of as two forms of compression: SMS compression makes use of common abbreviatory conventions to save space, while linguistic compression attempts to analyse the structure of a sentence to determine what parts of that sentence can be dropped without loss of meaning.
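As a rough illustration of the first kind of compression only (SMS-style abbreviation), the following Python sketch replaces words using a small lookup table and, if the text is still too long, drops trailing words until it fits. The abbreviation table and function names are invented for this example; the linguistic compression described above, which needs sentence analysis, is not attempted here.

```python
SMS_LIMIT = 160  # characters available in a single SMS message

# Hypothetical abbreviation table; a real one would be far larger.
ABBREVIATIONS = {
    "and": "&",
    "for": "4",
    "government": "govt",
    "please": "pls",
    "to": "2",
    "tomorrow": "2moro",
    "you": "u",
}


def abbreviate(word: str) -> str:
    """Abbreviate one word, preserving any trailing punctuation."""
    core = word.rstrip(".,;:!?")
    tail = word[len(core):]
    return ABBREVIATIONS.get(core.lower(), core) + tail


def sms_compress(text: str, limit: int = SMS_LIMIT) -> str:
    """Return text unchanged if it fits; otherwise abbreviate, then trim."""
    if len(text) <= limit:
        return text
    words = [abbreviate(w) for w in text.split()]
    # Crude fallback: drop words from the end until the message fits.
    while words and len(" ".join(words)) > limit:
        words.pop()
    return " ".join(words)


story = "The government said tomorrow and " * 10
summary = sms_compress(story)
print(len(summary), summary)
```

Dropping words from the end is the weakest part of this sketch; deciding which parts of a sentence can go without changing its meaning is exactly the linguistic compression problem the project addresses.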
Feedback for Mates
Supervisor: Rolf Schwitter
In this research project you will design and implement a web-based peer review system that will help to improve the quality of feedback that students get for assessed tutorial tasks. The idea is that student reviewers provide feedback to their colleagues on tutorial questions that require a short essay as an answer.
This system should support the following scenario:
Hercules Summerfield, an overworked lecturer at Macquarie University, specifies online one or more tutorial questions that require an essay-like answer. Together with such a weekly tutorial task, he specifies three deadlines: the time point when the students have to submit their answers; the time point when the student reviewers have to submit their reviews, and finally the time point when the students have to submit their revised version of the answers together with a rejoinder.
Students should be able to answer the tutorial questions online. After the first deadline, the submission of each student is automatically distributed to three (anonymous) student reviewers who are enrolled in the same unit.
All student reviewers provide feedback on the essays online and make suggestions on how these essays can be improved. After the second deadline, the students improve their essays online and explain in a rejoinder how they took the feedback of their colleagues into consideration. After the third deadline, Hercules Summerfield marks the following four items: the initial essay, the revised essay, the rejoinder, and the feedback of the reviewers. This marking is done by simply selecting a value from a menu of values for each item. The marks are distributed and stored in a database; note that the student reviewers also get marks for their reviews. Each student should be able to look up their marks at the end of this process.
Of course this peer review system is backed by a database, and it is important that the students and lecturer(s) can log in to the system in a secure way.
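The distribution step in this scenario (each submission goes to three reviewers from the same unit) could be handled in many ways. One simple possibility, sketched below in Python with hypothetical names, is to shuffle the class list and give each submission to the next three students in the shuffled order. This guarantees that nobody reviews their own essay and that everyone receives and writes the same number of reviews; anonymity would be handled separately by the web interface.

```python
import random


def assign_reviewers(students, reviews_per_submission=3, seed=None):
    """Map each author to a list of distinct reviewers, none of them the author."""
    if len(students) <= reviews_per_submission:
        raise ValueError("need more students than reviews per submission")
    order = list(students)
    random.Random(seed).shuffle(order)  # seed only makes runs repeatable
    n = len(order)
    # Author at position i is reviewed by the students at positions i+1 .. i+k.
    return {
        author: [order[(i + k) % n] for k in range(1, reviews_per_submission + 1)]
        for i, author in enumerate(order)
    }


assignment = assign_reviewers(["ann", "ben", "cat", "dan", "eve"], seed=1)
for author, reviewers in assignment.items():
    print(author, "is reviewed by", reviewers)
```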
Required background: COMP249 and good programming skills in Python
Abduction for text interpretation
Supervisor: Rolf Schwitter
Abduction, or inference to the best explanation, is a method of reasoning that starts from a given special case and a known generalisation and then makes a guess that explains the special case, for example:
Given: Tweety flies.
Known: Every bird flies.
Guess: Tweety is a bird.
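The pattern in this example can be mimicked with a toy Python sketch: given rules of the form "every P is Q" and an observation Q(x), propose each P(x) as a candidate explanation. This is only meant to illustrate the reasoning pattern; it is not how a description logic reasoner works, and the rule about aeroplanes is an extra assumption added to show that abduction may yield several competing guesses.

```python
# Known generalisations as (premise, conclusion) pairs: "every <premise> <conclusion>".
RULES = [
    ("bird", "flies"),
    ("aeroplane", "flies"),   # assumed extra rule, not in the example above
    ("bird", "has_feathers"),
]


def abduce(observation, rules=RULES):
    """Return candidate explanations (premise, individual) for an observation.

    observation is a (predicate, individual) pair such as ("flies", "Tweety").
    """
    predicate, individual = observation
    return [(premise, individual)
            for premise, conclusion in rules
            if conclusion == predicate]


for premise, individual in abduce(("flies", "Tweety")):
    print(f"Guess: {individual} is a {premise}.")
```

Choosing among competing guesses (the "best" explanation) requires further knowledge, for instance other facts about Tweety that only one hypothesis explains.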
In this research project you will investigate how a description logic reasoner (RacerPro) can be used to generate explanations for a specific application domain (of your choice). This requires the construction of a description logic knowledge base that combines facts, terminological knowledge and rules (= known generalisations). The idea is to extract facts automatically from unrestricted but domain-specific texts using an existing shallow text processing tool and to create the terminological knowledge and the required rules for this domain by hand. The extracted facts will be divided into two parts: a part (a) that the user would like to have explained and a part (b) that the user takes for granted. The facts of part (a) are then transformed into corresponding queries and the abductive reasoning service is asked for explanations. The goal is to implement a first prototype of such an abductive reasoning service that uses RacerPro as a backend.
Required background: COMP348 and good programming skills in Prolog or Python.
Trusted Web Search
Supervisor: Vijay Varadharajan
There is an opportunity for a hands-on and motivated student to join a research team and help to develop secure web search applications. The role would particularly suit someone with a background in Java development who is seeking to implement his/her knowledge in real applications and enhance his/her technical skills.
The key responsibilities will be:
- Development of web search applications with Java technology
- Translation of business requirements into technical solutions
- Design of a database and its implementation with Microsoft Access
- Parsing structured text files with Java and storing the parsed results in a database
- Using Java to search file folders with specific hierarchies and storing the results in a database
- Using socket based programming to collect data over the networks.
To be successful in this role, the student will require:
- Good academic results
- Ability to work collaboratively in a team under the direction of senior technical staff
- An enthusiastic can-do attitude and a desire to take ownership of assigned tasks and issues
- Solid Java programming knowledge
- Good database knowledge
A Trust Management System based on Web Services Technology
Supervisor: Yan Wang
Trust management and evaluation are critical in service-oriented environments. A typical approach is to complete the trust evaluation based on the feedback and ratings provided by service clients reflecting the quality of delivered services.
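As a minimal illustration of rating-based trust evaluation, the Python sketch below averages client ratings per service. The rating scale of 0 to 1, the function name and the plain average are all assumptions made for this example; actual trust evaluation models are typically more elaborate (for instance, weighting recent feedback or the credibility of the rater).

```python
from collections import defaultdict


def trust_scores(feedback):
    """Average the ratings given to each service.

    feedback is an iterable of (service, rating) pairs, rating in [0, 1].
    """
    totals = defaultdict(lambda: [0.0, 0])  # service -> [sum of ratings, count]
    for service, rating in feedback:
        if not 0.0 <= rating <= 1.0:
            raise ValueError(f"rating out of range: {rating}")
        totals[service][0] += rating
        totals[service][1] += 1
    return {service: total / count for service, (total, count) in totals.items()}


ratings = [("payments", 1.0), ("payments", 0.5), ("shipping", 0.0)]
print(trust_scores(ratings))
```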
This project will investigate how to build a trust management architecture based on Web Services technology. It will study and compare the performance of centralized, distributed and peer-to-peer architectures. The project is expected to provide a generic solution for a trust service.
In this project, the students will learn the concepts as well as technologies in web services, networks, trust management and trust evaluation.
Skill and knowledge required: XML and Java programming