Generalizing mutual clusters: A measure of cluster compactness
LE3 .A278 2014
Master of Science
Mathematics and Statistics
Mathematics & Statistics
Previous work (Chipman & Tibshirani, 2006) introduced the idea of a mutual cluster (MC): a group of points that are closer to each other than to any point outside the group. An MC can be characterized in terms of its diameter (the maximum distance between two points within the group) and its nearest outside distance (the smallest distance from a point in the group to a point outside it). In this thesis, we study the properties of a mutual cluster and generalize the original definition of an MC. New computational methods are developed. We start by relaxing the definition of an MC using the "decision ratio", the ratio of the nearest outside distance to the diameter. The decision ratio gives information about the separation between clusters. A simplification of the mutual cluster algorithm, classic.MC, is developed to work with a particular group of points, rather than as part of a bottom-up hierarchical clustering. We then propose a new technique to define a mutual cluster. This technique is based on quantiles and data depth. It checks whether a given group of points is an MC and calculates a modified decision ratio. This method is designed to be less sensitive to sample size and outliers. Illustrative examples are used to compare the two methods. Lastly, we conduct a designed experiment to study the effects of sample size (n), dimension (p), and the separation between cluster means, and to evaluate the performance of the two decision ratios.
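The diameter, nearest outside distance, and decision ratio described in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's classic.MC implementation. The function name `decision_ratio`, the use of Euclidean distance, and the toy data are assumptions made only for illustration. The sketch covers the unmodified ratio only. The quantile- and depth-based variant is not attempted here.

```python
from itertools import combinations
from math import dist, inf


def decision_ratio(points, members):
    """Return (diameter, nearest_outside, ratio) for one candidate group.

    points  : list of coordinate tuples (Euclidean distance is an assumption)
    members : set of indices into `points` forming the candidate group
    """
    inside = [points[i] for i in sorted(members)]
    outside = [p for i, p in enumerate(points) if i not in members]
    if len(inside) < 2 or not outside:
        raise ValueError("need at least two inside points and one outside point")

    # Diameter: largest pairwise distance within the group.
    diameter = max(dist(a, b) for a, b in combinations(inside, 2))
    # Nearest outside distance: smallest distance from a member to a non-member.
    nearest_outside = min(dist(a, b) for a in inside for b in outside)
    # Decision ratio: nearest outside distance divided by diameter.
    ratio = inf if diameter == 0 else nearest_outside / diameter
    return diameter, nearest_outside, ratio


# Hypothetical toy data: a tight group of three points and two distant points.
points = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5)]
diameter, nearest_outside, ratio = decision_ratio(points, {0, 1, 2})
is_mutual_cluster = ratio > 1  # members are closer to each other than to outsiders
print(diameter, nearest_outside, ratio, is_mutual_cluster)
```

Under the original definition, a group is a mutual cluster exactly when this ratio exceeds 1. Larger values indicate a group that is better separated from the remaining points.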
The author grants permission to the University Librarian at Acadia University to reproduce, loan or distribute copies of my thesis in microform, paper or electronic formats on a non-profit basis. The author retains the copyright of the thesis.