 Annual Proceedings of the South African Statistical Association Conference
Volume 2015, Issue 1, 2015

A new approach to approximating the distribution of aggregate discounted claims
Author: Franck Adekambi
Source: Annual Proceedings of the South African Statistical Association Conference 2015, pp 1–8 (2015)

We illustrate how alternating renewal processes can be used for the actuarial modelling of health insurance policies. Previous research has derived the cumulative distribution function and the moment generating function of the discounted value of the aggregate amount of benefit paid out up to the end of the n-th sickness period, n = 1, 2, 3, .... From a practical point of view, however, these two expressions are difficult to evaluate. This research therefore develops an approximation of the discounted value of the aggregate amount of benefit paid out up to the end of the sickness period, for the case of a constant force of interest. The approximation will, for example, be useful for calculating the insurer's probability of ruin, that is, the probability that the discounted value of the aggregate amount of benefit paid out exceeds the premium received plus the insurer's initial capital. Erlang distributions with different parameters are used for both the periods of health and of sickness, and illustrations are presented in Tables 1, 2 and 3 for a constant force of interest.
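As a rough illustration of the quantity being approximated (not the authors' method), the discounted aggregate benefit under an alternating renewal process can be simulated directly. All parameter values below (force of interest, benefit rate, Erlang shapes and rates) are illustrative assumptions, not taken from the paper.

```python
# Monte Carlo sketch of the discounted aggregate benefit paid up to the end
# of the n-th sickness period, under an alternating renewal process with
# Erlang-distributed healthy and sick periods and a constant force of interest.
import numpy as np

def discounted_benefit(n_periods, rng, delta=0.05, rate_benefit=1.0,
                       healthy=(2, 1.0), sick=(2, 4.0)):
    """One realisation of the discounted benefit over n sickness periods.

    healthy/sick are (shape, rate) parameters of Erlang distributions;
    benefit is paid continuously at rate `rate_benefit` while sick and
    discounted at the constant force of interest `delta`.
    """
    t, total = 0.0, 0.0
    for _ in range(n_periods):
        t += rng.gamma(healthy[0], 1.0 / healthy[1])      # healthy spell
        s = rng.gamma(sick[0], 1.0 / sick[1])             # sickness spell
        # integral of rate_benefit * exp(-delta * u) du over the sick spell
        total += rate_benefit * (np.exp(-delta * t) - np.exp(-delta * (t + s))) / delta
        t += s
    return total

rng = np.random.default_rng(0)
sims = np.array([discounted_benefit(3, rng) for _ in range(20000)])
mean_benefit = sims.mean()   # approximate expected discounted benefit
```

The simulated distribution of `sims` is what a closed-form approximation, such as the one the paper develops, would be checked against.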

An objective comparison between goodness-of-fit tests for exponentiality
Authors: J.S. Allison, L. Santana, N. Smit and I.J.H. Visagie
Source: Annual Proceedings of the South African Statistical Association Conference 2015, pp 9–16 (2015)

The exponential distribution is a popular model both in practice and in theoretical work. As a result, a multitude of tests have been developed for testing the hypothesis that observed data are realised from this distribution. Many of the recently developed tests contain a tuning parameter, usually appearing in a weight function. These tests are often evaluated over a grid of values for this parameter. However, this method does not lend itself to objective comparisons because the power of the test is highly dependent on the value of the tuning parameter. In this paper we compare the performance of tests that contain a data-dependent choice of the tuning parameter to other classical tests (which do not contain a tuning parameter). It is found that the tests based on the data-dependent choice of the tuning parameter compare favourably to the classical tests.
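A minimal sketch of the classical-test baseline the paper compares against (not one of the tuning-parameter tests themselves): a Kolmogorov-Smirnov test of exponentiality with the rate estimated from the data, calibrated by parametric bootstrap since plug-in estimation invalidates the standard KS critical values. The sample sizes and alternatives below are illustrative assumptions.

```python
import numpy as np

def ks_exp_pvalue(x, n_boot=500, rng=None):
    """Lilliefors-style KS test for exponentiality: the null distribution of
    the statistic (with the rate estimated by 1/mean) is approximated by a
    parametric bootstrap, which is valid because the statistic is scale-free."""
    if rng is None:
        rng = np.random.default_rng()
    x = np.asarray(x, float)
    n = len(x)
    def ks_stat(y):
        y = np.sort(y)
        f = 1.0 - np.exp(-y / y.mean())        # fitted exponential CDF
        i = np.arange(1, n + 1)
        return max(np.max(i / n - f), np.max(f - (i - 1) / n))
    d_obs = ks_stat(x)
    d_boot = [ks_stat(rng.exponential(1.0, n)) for _ in range(n_boot)]
    return float(np.mean([d >= d_obs for d in d_boot]))

rng = np.random.default_rng(1)
p_null = ks_exp_pvalue(rng.exponential(2.0, 100), rng=rng)   # truly exponential
p_alt = ks_exp_pvalue(rng.lognormal(0, 1, 100), rng=rng)     # lognormal alternative
```

Power comparisons of this kind, repeated over many simulated samples, are what make the tuning-parameter dependence of the newer tests visible.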

Evaluating risk in gold prices with generalized hyperbolic and stable distributions
Authors: Knowledge Chinhamu, Chun-Kai Huang and Delson Chikobvu
Source: Annual Proceedings of the South African Statistical Association Conference 2015, pp 17–24 (2015)

Risk management tools such as value-at-risk (VaR) are highly dependent on the underlying distributional assumption, and identifying a distribution that best captures all aspects of the given financial data may provide advantages to both investors and risk managers. In this paper, we investigate this possibility by establishing the best generalized hyperbolic distributions to fit gold price returns, while comparisons to stable distributions are also drawn. The adequacy of these distributions is assessed through the Anderson-Darling test, the Akaike information criterion, the Bayesian information criterion and backtesting of their respective VaR estimates.
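The VaR backtesting step can be sketched as follows. This is an illustrative stand-in, not the paper's analysis: a Student-t fit plays the role of the heavy-tailed generalized hyperbolic and stable fits, the returns are simulated rather than gold prices, and the backtest shown is the Kupiec proportion-of-failures test.

```python
import numpy as np
from scipy import stats

def kupiec_pvalue(returns, var, alpha=0.05):
    """Kupiec proportion-of-failures backtest: a likelihood-ratio test that
    the fraction of returns falling below -var equals the nominal alpha."""
    viol = returns < -var
    n, x = len(viol), int(viol.sum())
    pi = min(max(x / n, 1e-10), 1 - 1e-10)
    lr = -2 * (x * np.log(alpha) + (n - x) * np.log(1 - alpha)
               - x * np.log(pi) - (n - x) * np.log(1 - pi))
    return 1 - stats.chi2.cdf(lr, df=1)

rng = np.random.default_rng(2)
returns = stats.t.rvs(df=4, scale=0.01, size=1000, random_state=rng)

# fit a heavy-tailed model and read VaR off its lower quantile
df_, loc_, scale_ = stats.t.fit(returns)
var_95 = -stats.t.ppf(0.05, df_, loc_, scale_)   # 95% VaR, a positive loss level
p = kupiec_pvalue(returns, var_95, alpha=0.05)
```

A small p-value would indicate that the fitted distribution's VaR misstates the true violation frequency, which is how the candidate distributions are ranked in the paper's backtests.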

Influence of right-censoring on some kernel-smoothed hazard rates
Authors: Margaret De Villiers, Dalene Bezuidenhout and Paul J. Mostert
Source: Annual Proceedings of the South African Statistical Association Conference 2015, pp 25–32 (2015)

The effect of right-censoring on non-parametric estimation of the hazard rate is investigated in this paper. Three well-known and widely used kernel functions, applied to the Nelson-Aalen estimator of the cumulative hazard rate, were used to obtain smoothed hazard rate estimates. A simulation study was performed to assess the performance of the hazard rate estimation. In all these simulations, smoothed hazard rates were obtained while recording the frequency of optimal global bandwidths and assessing performance by estimating the variance, bias and coverage over event times. An example illustrates some of the simulation results.
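The estimator the abstract describes, a kernel smooth of the Nelson-Aalen increments, can be sketched compactly. The distributions, bandwidth and grid below are illustrative choices, not those of the paper's simulation study.

```python
import numpy as np

def smoothed_hazard(times, events, grid, bandwidth, kernel="epanechnikov"):
    """Kernel-smoothed hazard estimate obtained by smoothing the jumps of
    the Nelson-Aalen cumulative-hazard estimator over a time grid.
    `events` is 1 for an observed event, 0 for a right-censored time."""
    order = np.argsort(times)
    t, d = np.asarray(times, float)[order], np.asarray(events)[order]
    at_risk = len(t) - np.arange(len(t))          # number still at risk
    jumps = d / at_risk                           # Nelson-Aalen increments
    u = (grid[:, None] - t[None, :]) / bandwidth
    if kernel == "epanechnikov":
        k = np.where(np.abs(u) <= 1, 0.75 * (1 - u ** 2), 0.0)
    else:                                         # Gaussian kernel
        k = np.exp(-0.5 * u ** 2) / np.sqrt(2 * np.pi)
    return (k * jumps[None, :]).sum(axis=1) / bandwidth

# simulate right-censored exponential lifetimes (true hazard = 1)
rng = np.random.default_rng(3)
true_times = rng.exponential(1.0, 500)
censor = rng.exponential(2.0, 500)
obs = np.minimum(true_times, censor)
evt = (true_times <= censor).astype(int)

grid = np.linspace(0.2, 1.5, 10)
haz = smoothed_hazard(obs, evt, grid, bandwidth=0.4)
```

Repeating this over many simulated samples, while varying the censoring rate, kernel and bandwidth, gives the variance, bias and coverage summaries the paper reports.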

Properties of A- and D-optimal row-column designs for two-colour cDNA microarray experiments: robustness against missing arrays and efficiency
Authors: Legesse Kassa Debusho, Dibaba Bayisa Gemechu and Linda M. Haines
Source: Annual Proceedings of the South African Statistical Association Conference 2015, pp 33–39 (2015)

Two-colour complementary deoxyribonucleic acid (cDNA) microarray experiments are important experiments that allow scientists to study the expression levels of thousands of genes simultaneously. If it is assumed that there is a gene-specific dye effect in a microarray experiment, then there will be two blocking factors, array and dye. In such cases, the microarray experiments can be considered as row-column designs, with dyes as rows and arrays as columns. Furthermore, the experiments can be described using a linear mixed effects model by taking the arrays as random effects, when comparisons of all possible pairs of treatments are of particular interest. One of the important criteria for a good design is its robustness against a missing observation, which may occur due to insufficient resolution, image corruption, or scratches on the slide. This may result in disconnectedness of a design, which will lead to loss of precision in estimation and/or of possible comparisons between treatments. The main objective of this paper is to investigate robustness properties of the A- and D-optimal row-column designs against one or two missing arrays. The numerical results show that the robustness of optimal designs against missing arrays depends on the unknown parameter, which is a function of the random array variance and the error variance.

Modeling extreme daily temperature using generalized Pareto distribution at Port Elizabeth, South Africa
Authors: Tadele A. Diriba, Legesse Kassa Debusho and Joel Botai
Source: Annual Proceedings of the South African Statistical Association Conference 2015, pp 41–48 (2015)

The extremes of daily maximum temperature in summer and daily minimum temperature in winter were analyzed by fitting the generalized Pareto distribution (GPD) to the Port Elizabeth weather station data, South Africa. Since the extremes in the minimum and maximum temperature series do not follow a normal distribution, the non-parametric methods, namely Kendall's tau test and Sen's slope estimator, were used for the trend analysis. A significant positive trend was observed in the extreme annual minimum temperature. However, the inclusion of a linear trend in the log-scale parameter in the GPD model for the minimum daily winter temperature did not produce an improvement in the precision of the parameter estimates. The results from the return level analysis show that by the end of the twenty-first century the extreme summer maximum temperature could be about 5 °C higher than the current level in Port Elizabeth, whereas the change in the winter minimum temperature will be less severe, because the return level results suggest an increase of about 2 °C.
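The peaks-over-threshold workflow behind such a return-level analysis can be sketched in a few lines. The simulated "temperatures", threshold quantile and return period below are illustrative assumptions, not the Port Elizabeth data or the paper's model choices.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
# hypothetical daily summer maxima (deg C), standing in for station data
temps = 24 + 4 * rng.standard_gamma(2.0, size=3000) ** 0.5

u = np.quantile(temps, 0.95)                 # threshold
exc = temps[temps > u] - u                   # exceedances over the threshold
shape, loc, scale = stats.genpareto.fit(exc, floc=0.0)

def return_level(m, u, shape, scale, p_exceed):
    """m-observation return level of a GPD fitted to threshold exceedances:
    the level expected to be exceeded once every m observations."""
    if abs(shape) < 1e-9:                    # exponential limit of the GPD
        return u + scale * np.log(m * p_exceed)
    return u + scale / shape * ((m * p_exceed) ** shape - 1.0)

p_u = np.mean(temps > u)                     # empirical exceedance rate
rl_100 = return_level(100 * 365, u, shape, scale, p_u)   # ~100-year level
```

Adding a linear trend in the log-scale parameter, as the paper investigates, would replace the constant `scale` with `exp(a + b*t)` inside the likelihood.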

LASSO tuning parameter selection
Authors: Lisa-Ann Kirkland, Frans Kanfer and Sollie Millard
Source: Annual Proceedings of the South African Statistical Association Conference 2015, pp 49–56 (2015)

The LASSO is a penalized regression method which simultaneously performs shrinkage and variable selection. The output produced by the LASSO consists of a piecewise linear solution path, starting with the null model and ending with the full least squares fit as the value of a tuning parameter is decreased. The performance of the selected model therefore depends greatly on the choice of this parameter. This paper provides an overview of methods which are available to select the value of the tuning parameter for either prediction or variable selection purposes. A simulation study provides a comparison of these methods and assesses their performance.
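One of the standard tuning-parameter selectors, cross-validation on predictive error, can be illustrated with scikit-learn. The sparse simulation design below is a generic example, not the paper's simulation study.

```python
import numpy as np
from sklearn.linear_model import Lasso, LassoCV

rng = np.random.default_rng(5)
n, p = 200, 20
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:3] = [3.0, -2.0, 1.5]                 # sparse truth: 3 active variables
y = X @ beta + rng.standard_normal(n)

# cross-validation picks the penalty level alpha by out-of-fold prediction
# error, one of the selection criteria a comparison like this would cover
cv_model = LassoCV(cv=5, random_state=0).fit(X, y)
fit = Lasso(alpha=cv_model.alpha_).fit(X, y)
selected = np.flatnonzero(fit.coef_ != 0)   # indices of selected variables
```

Prediction-oriented selectors such as CV tend to pick a smaller alpha and over-select, while selection-oriented criteria (e.g. BIC-type rules) penalise more heavily, which is exactly the trade-off such a comparison examines.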

Modelling average minimum daily temperature using extreme value theory with a time-varying threshold
Source: Annual Proceedings of the South African Statistical Association Conference 2015, pp 57–64 (2015)

In this paper we present an application of the generalized Pareto distribution (GPD) in the modelling of average minimum daily temperature in South Africa for the period January 2000 to August 2010. A penalized cubic smoothing spline is used as a time-varying threshold as well as to cater for seasonality. We then extract the excesses (residuals) above the cubic spline and fit a non-parametric mixture model to obtain a sufficiently high threshold. The data exhibit evidence of short-range dependence and high seasonality, which led us to decluster the excesses above the sufficiently high threshold and fit the GPD to the cluster maxima. The parameters are estimated using the maximum likelihood method. The estimate of the shape parameter shows that the Weibull family of distributions is appropriate for modelling the upper tail of the distribution of average minimum daily temperature in South Africa. The bootstrap resampling method is used as an assessment tool for uncertainty in the parameter estimation. This study has shown that using the penalized cubic smoothing spline as a time-varying threshold for time series data which exhibit strong seasonality provides a good fit of the GPD to the cluster maxima. This results in accurate estimates of return levels.
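The declustering step, needed because of the short-range dependence the abstract mentions, can be sketched with the standard runs method. The toy series and run length below are illustrative; the paper's choices may differ.

```python
import numpy as np

def decluster(values, threshold, run_length):
    """Runs declustering: exceedances of `threshold` separated by fewer than
    `run_length` observations are assigned to the same cluster; the maximum
    of each cluster is returned, ready for GPD fitting."""
    exceed_idx = np.flatnonzero(values > threshold)
    if exceed_idx.size == 0:
        return np.array([])
    maxima, cluster_max = [], values[exceed_idx[0]]
    for prev, cur in zip(exceed_idx[:-1], exceed_idx[1:]):
        if cur - prev >= run_length:      # gap long enough: start a new cluster
            maxima.append(cluster_max)
            cluster_max = values[cur]
        else:                             # same cluster: keep the running max
            cluster_max = max(cluster_max, values[cur])
    maxima.append(cluster_max)
    return np.array(maxima)

series = np.array([1, 5, 6, 2, 1, 1, 7, 8, 7, 1, 1, 1, 9])
cm = decluster(series, threshold=4, run_length=3)   # cluster maxima: 6, 8, 9
```

In the paper's pipeline, `values` would be the excesses above the spline threshold and the GPD would then be fitted to `cm`, the cluster maxima.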

The risk performance of a heteroscedastic preliminary test estimator under the reflected normal loss function
Source: Annual Proceedings of the South African Statistical Association Conference 2015, pp 65–72 (2015)

The problem of heteroscedasticity is commonly encountered in regression models, and it is known that, under heteroscedasticity, the ordinary least squares estimator is relatively inefficient. This paper focuses on the risk performance of a preliminary test estimator for the regression coefficients after a preliminary test for heteroscedasticity has been performed. The risk performance under the symmetric and bounded reflected normal loss function is derived and is numerically evaluated using Monte Carlo simulations. From these results it is clear that the relative risk gains of the two-stage Aitken estimator and the preliminary test estimator over the ordinary least squares estimator generally increase with higher levels of heteroscedasticity.

Investment-policy surrender prediction with random survival forests
Authors: Peter Smith, Frans Kanfer and Sollie Millard
Source: Annual Proceedings of the South African Statistical Association Conference 2015, pp 73–80 (2015)

In this article we introduce and discuss Random Survival Forests, a modern ensemble method for predicting right-censored survival data, and present an original application of the model in the prediction of surrenders of investment policies. The model's performance is benchmarked against the Cox model, a semi-parametric model that has been the mainstay of survival analysis since its introduction in the early 1970s. Predictive performance is measured via an adaptation of the Brier score for right-censored data using what is known as inverse probability of censoring weights. In this application the Random Survival Forest is shown to have superior predictive performance to the Cox model.
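The evaluation metric the abstract names, the Brier score with inverse probability of censoring weights, can be sketched directly (following the Graf et al. weighting scheme; the simulated data and horizon are illustrative, not the policy data).

```python
import numpy as np

def km_surv(times, ind):
    """Kaplan-Meier survival curve for the event type flagged by `ind`."""
    order = np.argsort(times)
    t, d = np.asarray(times)[order], np.asarray(ind)[order]
    at_risk = len(t) - np.arange(len(t))
    return t, np.cumprod(np.where(d == 1, 1.0 - 1.0 / at_risk, 1.0))

def ipcw_brier(times, events, surv_pred, t0):
    """Brier score at horizon t0 for predictions surv_pred[i] = S_hat(t0|x_i),
    reweighted by the inverse probability of censoring so that right-censored
    observations do not bias the score."""
    t_g, g = km_surv(times, 1 - np.asarray(events))    # censoring-distribution KM
    def G(s):
        idx = np.searchsorted(t_g, s, side="right") - 1
        return g[idx] if idx >= 0 else 1.0
    n, score = len(times), 0.0
    for ti, di, si in zip(times, events, surv_pred):
        if ti <= t0 and di == 1:          # observed event before the horizon
            score += si ** 2 / G(ti)
        elif ti > t0:                     # known to survive past the horizon
            score += (1.0 - si) ** 2 / G(t0)
        # observations censored before t0 enter only through the weights
    return score / n

rng = np.random.default_rng(7)
n = 300
true_t = rng.exponential(2.0, n)
cens = rng.exponential(4.0, n)
obs, evt = np.minimum(true_t, cens), (true_t <= cens).astype(int)

t0 = 1.0
s_true = np.exp(-t0 / 2.0)                         # true S(t0) for exp(mean 2)
bs_good = ipcw_brier(obs, evt, np.full(n, s_true), t0)   # well-calibrated model
bs_bad = ipcw_brier(obs, evt, np.full(n, 0.05), t0)      # badly calibrated model
```

In the paper this score, evaluated on the policy data, is what ranks the Random Survival Forest above the Cox model.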