Extending the XML compressor Exact with lazy updates
LE3 .A278 2011
Master of Science
Extensible Markup Language (XML) is a commonly used format for storing structured data in a cross-compatible manner that is also human readable. Unfortunately, XML stores data using a large amount of redundancy, thereby creating files larger than needed. Many XML compressors are specially built to compress XML data. These compressors have the advantage of reducing the size needed to store XML data, but the downside is that a compressed XML file must first be decompressed in order to make changes to it. This thesis outlines a proposed method for updating compressed XML data while only partially decompressing a file, which should greatly reduce the time taken to make changes to a compressed XML file. The implementation of the compressed XML updater, called Exact, was tested and the results showed that Exact performed faster than Qizx, the best known updater of non-compressed XML documents, by between 348% and 2,730% depending on the document and the number of updates. Therefore, Exact meets our goal of maintaining a strong compression ratio, as compared to other XML compressors, while providing fast updates of compressed XML.
The author retains copyright in this thesis. Any substantial copying or any other actions that exceed fair dealing or other exceptions in the Copyright Act require the permission of the author.