Filtering compressed XML streams
LE3 .A278 2007
Master of Science
'Information filtering' is the problem of extracting desired data from a corpus. The task of 'filtering' depends heavily on the format of that corpus. For example, the corpus may be a stationary relational database, a flat file, or a collection of files, or, in the case of a news feed, it may be continually 'streamed' to clients. In this research, we examine the problem of filtering data from a continual 'stream' that has been compressed with an 'online', XML-conscious compressor. We introduce a filtering system that leverages the format of compressed XML streams to provide 'subscription-oriented' filtering of the data contained in the stream. Additionally, the system provides 'persistence' by efficiently storing the 'compressed' results in relational tables, allowing subscribers to receive filtered content even if they are disconnected when filtering occurs. These features make our system useful in applications where XML data is continually streamed, such as news feeds, scientific document notification services, or XML routing applications.
The author grants permission to the University Librarian at Acadia University to reproduce, loan or distribute copies of my thesis in microform, paper or electronic formats on a non-profit basis. The author retains the copyright of the thesis.