Assessing the accuracy of Bayesian Additive Regression Tree credible intervals
LE3 .A278 2016
2016
Chipman, Hugh
Acadia University
Master of Science
Masters
Mathematics and Statistics
Mathematics & Statistics
A common supervised learning problem is to use training data to estimate a predictive model for a numeric response. Many supervised learning models, such as Bayesian Additive Regression Trees (BART), aim to model the data flexibly. BART is a Bayesian “sum of trees” model that uses MCMC backfitting to simulate samples from the posterior. It also provides credible intervals (CIs) for predictions. This thesis studies the accuracy of BART credible intervals and analyzes how several factors affect it: sample size, dimension, noise standard deviation, correlation among predictors, junk variables, type of error distribution, and BART method. CI accuracy is computed by simulation in a designed experiment that systematically varies these factors to estimate their effects. Analysis of the experimental results shows that BART CI accuracy depends considerably on sample size and error variance.
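The simulation idea in the abstract can be illustrated with a minimal sketch. It is not the thesis's code and it does not fit BART: as a labeled assumption, a simple conjugate normal-mean posterior stands in for the BART posterior so that the example runs with the Python standard library alone. The function names `credible_interval` and `coverage`, the factor levels, and the replication counts are all hypothetical. Only the structure is meant to carry over: simulate data at each factor combination, draw posterior samples, form an equal-tailed CI, and record how often it contains the true value.

```python
import random
import statistics


def credible_interval(draws, level=0.95):
    """Equal-tailed interval taken from the order statistics of posterior draws."""
    s = sorted(draws)
    alpha = 1.0 - level
    lo_idx = int((alpha / 2) * (len(s) - 1))  # floor of the lower-tail index
    hi_idx = len(s) - 1 - lo_idx              # symmetric upper-tail index
    return s[lo_idx], s[hi_idx]


def coverage(n, sigma, n_reps=400, n_draws=500, level=0.95, seed=0):
    """Fraction of simulated data sets whose CI contains the true mean.

    ASSUMPTION: the posterior here is N(ybar, sigma^2 / n), i.e. a flat prior
    with known sigma. This is a stand-in for BART posterior draws, not BART.
    """
    rng = random.Random(seed)
    true_mean = 1.0  # hypothetical true value of the quantity being predicted
    hits = 0
    for _ in range(n_reps):
        y = [rng.gauss(true_mean, sigma) for _ in range(n)]
        ybar = statistics.fmean(y)
        post_sd = sigma / n ** 0.5
        draws = [rng.gauss(ybar, post_sd) for _ in range(n_draws)]
        lo, hi = credible_interval(draws, level)
        hits += lo <= true_mean <= hi
    return hits / n_reps


# Small full-factorial design over two of the factors named in the abstract.
# The levels are illustrative, not those used in the thesis.
for n in (20, 100):
    for sigma in (0.5, 2.0):
        print(f"n={n:4d} sigma={sigma:.1f} coverage={coverage(n, sigma):.3f}")
```

Under this stand-in posterior the empirical coverage should sit near the nominal level at every design point, because the model is correctly specified. To study BART itself, one would replace the posterior draws with draws from a fitted BART model at each test point and compare the resulting coverage across factor levels.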
The author retains copyright in this thesis. Any substantial copying or any other actions that exceed fair dealing or other exceptions in the Copyright Act require the permission of the author.
https://scholar.acadiau.ca/islandora/object/theses:1430