Annual Proceedings of the South African Statistical Association Conference - latest Issue
Congress 1, November 2016
Moment of the discounted compound renewal cash flows with dependence: the use of Farlie-Gumbel-Morgenstern copula
Source: Annual Proceedings of the South African Statistical Association Conference 2016, pp 1–8 (2016)
In this paper we derive the first moment of the discounted compound renewal cash flows when taking into account dependence between the cash flow and its occurrence time. The dependence structure between the two random variables is defined by a Farlie-Gumbel-Morgenstern copula.
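For reference, the Farlie-Gumbel-Morgenstern copula named above has the standard textbook form below. The symbols u, v and θ are notation chosen for this note and are not necessarily the paper's own.

```latex
\[
C_\theta(u, v) = uv\,\bigl[1 + \theta\,(1 - u)(1 - v)\bigr],
\qquad (u, v) \in [0,1]^2,\quad \theta \in [-1, 1],
\]
\[
c_\theta(u, v) = \frac{\partial^2 C_\theta}{\partial u\,\partial v}
             = 1 + \theta\,(1 - 2u)(1 - 2v).
\]
```

Setting θ = 0 recovers independence between the cash flow and its occurrence time.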
Bayesian multinomial ordinal model to analyse the risk factors and spatial patterns of childhood anaemia in Tanzania
Source: Annual Proceedings of the South African Statistical Association Conference 2016, pp 9–16 (2016)
Using the 2010 Tanzania Demographic and Health Survey (TDHS) data, we fit a semi-parametric model that combines fixed effects, non-linear terms and spatial components in a unified framework. The spatial effect was modelled using a Markov random field prior. We simultaneously investigate the geographical variation and the risk factors for a polychotomous anaemia response. We ran several Bayesian models via Markov chain Monte Carlo (MCMC) simulation techniques and compared them using the Deviance Information Criterion (DIC). We found that the risk factors associated with anaemia include place of residence, household poverty, childhood under-nutrition, and infectious diseases. We also evaluated the non-linear effects of a mother's age, body mass index, and haemoglobin level. Our method detects spatial effects that may not have been captured by the underlying factors, and we produce predictive probability maps. Higher risk was found in the Eastern regions of Tanzania. The results highlight highly endemic regions, which can help government agencies target scarce health resources and direct policy effectively.
Source: Annual Proceedings of the South African Statistical Association Conference 2016, pp 17–24 (2016)
This paper aims to give a better understanding of multilevel analysis and the iterative generalised least squares (IGLS) procedure by making use of matrix methods. The basic philosophy and theoretical concepts of a two-level model will be demonstrated by simulating data with a specified two-level structure. This nested structure is especially common in the social and medical sciences, where observations are grouped within certain levels. Explanatory variables will be introduced to accommodate the variation within and between levels. PROC IML in SAS will be used to simulate data and estimate the multilevel model.
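As a minimal sketch of what simulating data with a two-level structure can look like, the Python below generates a random-intercept model y_ij = β0 + β1·x_ij + u_j + e_ij. It is an illustration only: the paper works in SAS PROC IML, and the function names, parameter values and the single covariate here are assumptions made for this note, not taken from the paper.

```python
import random


def simulate_two_level(n_groups=50, n_per_group=20, beta0=2.0, beta1=0.5,
                       sigma_u=1.0, sigma_e=2.0, seed=0):
    """Simulate (group, x, y) rows from a two-level random-intercept model.

    All parameter values are hypothetical defaults chosen for illustration.
    """
    rng = random.Random(seed)
    rows = []
    for j in range(n_groups):
        u_j = rng.gauss(0.0, sigma_u)      # level-2 (group) random intercept
        for _ in range(n_per_group):
            x = rng.gauss(0.0, 1.0)        # level-1 explanatory variable
            e = rng.gauss(0.0, sigma_e)    # level-1 residual
            rows.append((j, x, beta0 + beta1 * x + u_j + e))
    return rows


def intraclass_correlation(sigma_u, sigma_e):
    """Share of the total variance that lies between groups."""
    return sigma_u ** 2 / (sigma_u ** 2 + sigma_e ** 2)


rows = simulate_two_level()
icc = intraclass_correlation(1.0, 2.0)
```

The intraclass correlation shows how strongly observations within the same group resemble each other under the chosen variance components. Estimating the model by IGLS is not attempted in this sketch.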
Source: Annual Proceedings of the South African Statistical Association Conference 2016, pp 25–32 (2016)
In this paper multilevel analysis is performed on the most recent data available from the Progress in International Reading Literacy Study (PIRLS) to identify and gain a better understanding of factors that are associated with reading literacy scores of Grade 5 learners in South Africa. The hierarchical structure of the data is accommodated by multilevel analysis where learners are nested within schools. Two random intercept models are considered which allow for different perspectives, one where reading literacy score is entered as a continuous variable and the second where it is entered as a binary response variable. In the latter case the reading literacy score is assessed relative to the international centre point of 500 and the model is used to identify factors that are significantly related to the odds of obtaining a score of at least 500. Covariates considered are grouped into three main categories namely demographics, those used to describe Carroll's model of school learning and those for socio-economic status at both learner- and school level. Results from the two models are compared and differences discussed. The importance of considering different multilevel models to gain a better understanding of relationships is highlighted.
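The two random intercept models described above can be written in a generic form as follows. The notation (learner i in school j, covariate vector x_ij, school effect u_j) is assumed for this note and may differ from the paper's.

```latex
\[
\text{Continuous response:}\quad
y_{ij} = \beta_0 + \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + u_j + e_{ij},
\qquad u_j \sim N(0, \sigma_u^2),\quad e_{ij} \sim N(0, \sigma_e^2),
\]
\[
\text{Binary response:}\quad
\log\frac{\Pr(y_{ij} \ge 500)}{1 - \Pr(y_{ij} \ge 500)}
  = \beta_0 + \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + u_j,
\qquad u_j \sim N(0, \sigma_u^2).
\]
```

In the binary model, the exponential of a coefficient is the multiplicative change in the odds of scoring at least 500 for a one-unit change in the corresponding covariate, conditional on the school effect.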
Source: Annual Proceedings of the South African Statistical Association Conference 2016, pp 33–40 (2016)
Extreme value theory (EVT) is commonly used for evaluating risk in financial returns. In particular, it can be amalgamated with a GARCH model, where the peaks-over-threshold (POT) method is applied to the innovations. However, this GARCH-EVT approach relies on the assumption that the innovations are independent and identically distributed. To relax this assumption, we generalise the POT method to exchangeable sequences. We apply this new approach, with the GARCH filter, to forecast one-day-ahead Value-at-Risk estimates for FTSE100 and ALSI daily returns. The approach shows significant improvements over some standard methods.
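For orientation, the sketch below implements the standard POT quantile estimator under the usual i.i.d. assumption, that is, the baseline the paper sets out to relax, not the exchangeable-sequence generalisation itself. The function name `pot_var` and all argument names are hypothetical. The generalised Pareto shape `xi` and scale `beta` are assumed to have been fitted to the threshold exceedances beforehand.

```python
import math


def pot_var(threshold, xi, beta, n_total, n_exceed, q):
    """Standard i.i.d. POT estimate of the q-quantile (Value-at-Risk).

    threshold : chosen threshold u
    xi, beta  : fitted generalised Pareto shape and scale (assumed given)
    n_total   : number of observations
    n_exceed  : number of observations above the threshold
    q         : confidence level, e.g. 0.99
    """
    tail_ratio = n_total / n_exceed * (1.0 - q)
    if xi == 0.0:
        # Limiting exponential-tail case of the generalised Pareto distribution.
        return threshold - beta * math.log(tail_ratio)
    return threshold + beta / xi * (tail_ratio ** (-xi) - 1.0)


var_95 = pot_var(threshold=1.5, xi=0.2, beta=1.0, n_total=1000, n_exceed=100, q=0.95)
var_99 = pot_var(threshold=1.5, xi=0.2, beta=1.0, n_total=1000, n_exceed=100, q=0.99)
```

In a GARCH-EVT setting this quantile is computed for the standardised innovations and then rescaled by the forecast conditional mean and volatility to give the one-day-ahead Value-at-Risk.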
Source: Annual Proceedings of the South African Statistical Association Conference 2016, pp 41–48 (2016)
In estimating the population mean of a study variable y, we can often use a ratio-type estimator when a related auxiliary variable x, with improved accessibility, is available. In cases where x is qualitative, or may be categorised, and a double sampling plan is used, we may consider a two-phase stratified sampling design. Traditionally, it is assumed that the N variables representing the readings on y are IID within and across strata. In this paper, we relax this assumption to a judgement of exchangeable sequences within each stratum, while still maintaining the assumption of independence across strata. This caters for the existence of dependence structures for within-stratum readings. We propose a methodology for estimating the variance of the ratio estimator under this scenario. Through an example, we show that this method provides a significantly more conservative estimate for the sampling variance, as compared to the standard approach.
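As background, the classical single-phase ratio estimator of a population mean can be sketched as below. This is the textbook form with a known population mean of x, not the two-phase stratified estimator or the variance methodology proposed in the paper. The function name and example data are hypothetical.

```python
def ratio_estimate_mean(y_sample, x_sample, x_population_mean):
    """Classical ratio estimator of the population mean of y.

    Scales the known population mean of the auxiliary variable x by the
    sample ratio of means, ybar / xbar.
    """
    if len(y_sample) != len(x_sample) or not y_sample:
        raise ValueError("samples must be non-empty and of equal length")
    y_bar = sum(y_sample) / len(y_sample)
    x_bar = sum(x_sample) / len(x_sample)
    return (y_bar / x_bar) * x_population_mean


# Hypothetical data: y is exactly twice x in the sample.
estimate = ratio_estimate_mean([2.0, 4.0, 6.0], [1.0, 2.0, 3.0], 10.0)
```

The estimator gains precision over the plain sample mean when y and x are strongly positively correlated, which is why an accessible auxiliary variable is useful.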
Source: Annual Proceedings of the South African Statistical Association Conference 2016, pp 49–56 (2016)
Conjoint analysis is a technique employed in marketing for unravelling consumer preferences for products. Traditional conjoint analysis makes use of linear fixed effects models estimated via Ordinary Least Squares (OLS) or Maximum Likelihood (ML). Consumers tend to differ in their view of products and, as a result, rate them differently. A model that takes account of this inherent heterogeneity in product rating would do better in explaining the variation in consumer preference ratings. Although consumers differ with regard to product choice or rating, there may exist classes of consumers, not observed during the conjoint study, whose rating or choice of products is similar. Hence a model that can account for individual consumer differences and the correlations between ratings given by consumers, and simultaneously uncover these latent classes, would be valuable. A mixed effects model in which the random effects follow a finite mixture of normal distributions (a latent class mixed effects model) is proposed as a possible model to achieve this objective. An approximate Maximum Likelihood (ML) method will be used to estimate the model via the Expectation Maximisation (EM) algorithm.
Source: Annual Proceedings of the South African Statistical Association Conference 2016, pp 57–64 (2016)
A decomposition of the expected prediction error into bias and variance components is useful when investigating the accuracy of a predictor. However, in classification such a decomposition is not as straightforward as in the case of squared-error loss in regression. As a result various definitions of bias and variance for classification can be found in the literature. In this paper these definitions are reviewed and an empirical study of a particular bias-variance decomposition is presented for ensemble classifiers.
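For comparison, the squared-error decomposition that the abstract contrasts with the classification case has the familiar form below. The notation is assumed for this note: Y = f(x) + ε with E[ε] = 0 and Var(ε) = σ², and \hat{f} is a predictor fitted on a random training sample.

```latex
\[
\mathbb{E}\bigl[(Y - \hat{f}(x))^2\bigr]
  = \underbrace{\sigma^2}_{\text{irreducible error}}
  + \underbrace{\bigl(\mathbb{E}[\hat{f}(x)] - f(x)\bigr)^2}_{\text{squared bias}}
  + \underbrace{\operatorname{Var}\bigl(\hat{f}(x)\bigr)}_{\text{variance}}.
\]
```

Under 0-1 loss the error does not split additively in this way, which is why several competing definitions of bias and variance exist for classification.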