
# Polynomial regression model of making cost prediction in mixed cost analysis

Published on: Mar 4, 2016
Published in: Technology      Economy & Finance

#### Transcripts - Polynomial regression model of making cost prediction in mixed cost analysis

• 1. Mathematical Theory and Modeling, www.iiste.org, ISSN 2224-5804 (Paper), ISSN 2225-0522 (Online), Vol.2, No.2, 2012

Polynomial Regression Model of Making Cost Prediction in Mixed Cost Analysis

Isaac, O. Ajao (Corresponding author), Department of Mathematics and Statistics, The Federal Polytechnic, Ado-Ekiti, PMB 5351, Ado-Ekiti, Ekiti State, Nigeria. Tel: +2348035252017 E-mail: isaac_seyi@yahoo.com

Adedeji, A. Abdullahi, Department of Mathematics and Statistics, The Federal Polytechnic, Ado-Ekiti, PMB 5351, Ado-Ekiti, Ekiti State, Nigeria. Tel: +2348062632084 E-mail: anzwers2003@yahoo.com

Ismail, I. Raji, Department of Mathematics and Statistics, The Federal Polytechnic, Ado-Ekiti, PMB 5351, Ado-Ekiti, Ekiti State, Nigeria. Tel: +2348029023836 E-mail: rajimaths@yahoo.com

Abstract

Regression analysis is used across business fields for tasks as diverse as systematic risk estimation, production and operations management, and statistical inference. This paper presents cubic polynomial least-squares regression as a robust alternative to the usual linear regression for making cost predictions in business. The study reveals that polynomial regression is a better alternative, with a very high coefficient of determination.

Keywords: Polynomial regression, linear regression, high-low method, cost prediction, mixed cost.

1. Introduction

Current practice in teaching regression analysis relies on users investigating data sets with techniques that allow description and inference. There are many alternatives, however, for actual learner computation of regression coefficients and summary statistics. Kmenta (1971) presents a computational design that allows users to complete the calculations with only a pencil and paper.
Brigham (1968) suggests that learners might simply construct a scatter plot and use a ruler to visually approximate the regression line. Gujarati (2009) recommends the use of statistical packages, which are now easily accessible to users on mainframe and microcomputers (Mundrake & Brown, 1989).

Mixed costs have both a fixed portion and a variable portion. Managers use a handful of methods to break mixed costs into two manageable components: fixed and variable costs. Breaking mixed costs into fixed and variable portions allows us to use the costs to predict and plan for the future, since it gives good insight into how these costs behave at various activity levels. The process of separating a mixed cost into its fixed and variable components is often called cost estimation. The methods
• 2. commonly used are the scatter graph, the high-low method, and ordinary least squares linear regression. The goal of cost estimation is to determine the amount of fixed and variable costs so that a cost equation can be used to predict future costs.

2. Data and method

The high-low method uses the highest and the lowest activity levels over a period of time to estimate the portion of a mixed cost that is variable and the portion that is fixed. Because it uses only the high and low activity levels to calculate the variable and fixed costs, it may be misleading if those levels are not representative of normal activity. The high-low method is most accurate when the high and low levels of activity are representative of the majority of the points.

$$\text{Variable cost per unit } (b) = \frac{y_2 - y_1}{x_2 - x_1}$$

where

$y_2$ = the total cost at the highest level of activity;

$y_1$ = the total cost at the lowest level of activity;

$x_2$ = the number of units at the highest level of activity; and

$x_1$ = the number of units at the lowest level of activity.

In other words, variable cost per unit is equal to the slope of the cost line (i.e. change in total cost / change in number of units produced).

$$\text{Total fixed cost } (a) = y_2 - b x_2 = y_1 - b x_1$$

The high-low method can be quite misleading. The reason is that cost data are rarely linear and inferences are based on only two observations, either of which could be a statistical anomaly or outlier. The goal of least squares is to define a line that fits through a set of points on a graph such that the cumulative sum of squared distances between the points and the line is minimized, hence the name "least squares".

2.2 Polynomial Regression model

In statistics, polynomial regression is a form of linear regression in which the relationship between the independent variable x and the dependent variable y is modeled as an nth order polynomial.
Polynomial regression fits a nonlinear relationship between the value of x and the corresponding conditional mean of y, denoted $E(y \mid x)$ (Fan, 1996; Magee, 1998). Although polynomial regression fits a nonlinear model to the data, as a statistical estimation problem it is linear, in the sense that the regression
• 3. function $E(y \mid x)$ is linear in the unknown parameters that are estimated from the data.

2.3 The model

$$y_i = \beta_0 + \beta_1 x_i + \beta_2 x_i^2 + e_i, \qquad i = 1, 2, \ldots, n \qquad \text{(i)}$$

Mathematically, equation (i) represents a parabola, also known as a quadratic function or, more generally, a second-degree polynomial in the variable x; the highest power of x gives the degree of the polynomial. If $x^3$ were added to the preceding function (Gujarati, 2009; Studenmund & Cassidy, 1987), it would be a third-degree polynomial, and so on.

Adding the cubic term to equation (i) gives

$$y_i = \beta_0 + \beta_1 x_i + \beta_2 x_i^2 + \beta_3 x_i^3 + e_i, \qquad i = 1, 2, \ldots, n \qquad \text{(ii)}$$

which is called a third-degree polynomial regression.

The general kth-degree polynomial regression is written as:

$$y_i = \beta_0 + \beta_1 x_i + \beta_2 x_i^2 + \cdots + \beta_k x_i^k + e_i, \qquad i = 1, 2, \ldots, n$$

where $\beta_0, \beta_1, \ldots, \beta_k$ are the parameters of the model and $e_i$ is a random error term.

3. Data Presentation and Analysis

All analyses were done using MINITAB 11. The scattergram in Fig. (i) suggests the type of regression model that will fit the data in Table 1. From this figure it is clear that the relationship between total cost and output resembles an elongated S-curve. The total cost curve first increases gradually and then rapidly, as predicted by the celebrated law of diminishing returns. This S-shape of the total cost curve can be captured by the following cubic or third-degree polynomial:

$$y_i = \beta_0 + \beta_1 x_i + \beta_2 x_i^2 + \beta_3 x_i^3 + e_i, \qquad i = 1, 2, \ldots, n$$

where y = total cost and x = output.

3.1 Using the High-Low method

$$\text{Variable cost per unit (slope)} = \frac{2\,000\,000 - 500\,000}{175\,000 - 60\,000} \approx 13.04$$

that is, N 13.04 per unit.

TC = FC + VC (X)
• 4. where X = number of units.

Using total cost (TC) = N 2 000 000, variable cost per unit (VC) = N 13.04 and X = 175 000 to obtain total fixed cost (FC):

N 2 000 000 = FC + N 13.04 (175 000)

FC = N 2 000 000 - N 2 282 000 = -N 282 000

The cost line from the above equations becomes:

TC = -N 282 000 + N 13.04 (X)   (vi)

The negative amount of fixed costs is not realistic and suggests that the total cost at either the high point or the low point is not representative. The high-low method of determining the fixed and variable portions of a mixed cost relies on only two sets of data: the costs at the highest level of activity and the costs at the lowest level of activity. If either set of data is flawed, the calculation can result in an unreasonable, negative amount of fixed cost. It is possible that at the highest point of activity the costs were out of line with the normal relationship, that is, an outlier.

4. Discussion of Results

The R-Square value is a statistic that characterizes how well a particular line fits a set of data. As a general rule, the closer R2 is to 1.00 the better, since a value of 1.00 would represent a perfect fit where every point falls exactly on the fitted line. The lowest P-value (0.0000895) belongs to the linear model and the highest R2 (0.874) to the cubic polynomial regression model (Table 4).

For the high-low method, the negative amount of fixed costs is not realistic and suggests that the total cost at either the high point or the low point is not representative. The method relies on only two sets of data: the costs at the highest level of activity and the costs at the lowest level of activity. If either set of data is flawed, the calculation can result in an unreasonable, negative amount of fixed cost.
It is possible that at the highest point of activity the costs were out of line with the normal relationship, that is, an outlier. All these are indications of the crude and unscientific nature of the high-low method.

5. Conclusion and Recommendation

Based on the results of the analyses, it can be concluded that the polynomial regression model is better than the conventional linear regression and high-low methods, especially when analysing data relating to cost and production functions. The linear and quadratic models are not too bad for prediction with respect to the data used in this paper, but the cubic polynomial regression is better. It is therefore recommended that data
• 5. analysts should endeavour to always plot a simple scatter diagram before using any regression model, in order to know the type of relationship that exists between the variables of interest.

References

Brigham, E.F. (1986). Fundamentals of Financial Management (4th ed.). Chicago: Dryden Press.

Fan, Jianqing (1996). "1.1 From linear regression to nonlinear regression". Local Polynomial Modelling and Its Applications. Monographs on Statistics and Applied Probability. Chapman & Hall/CRC.

Gujarati, D.N. and Porter, D.C. (2009). Basic Econometrics. New York: McGraw-Hill.

http://www.studyzone.org/testprep/math4/d/linegraph4l.cfm: Data on monthly unit production and the associated costs.

Kmenta, J. (1971). Elements of Econometrics. New York: Macmillan.

Magee, Lonnie (1998). "Non-local Behavior in Polynomial Regressions". The American Statistician (American Statistical Association) 52 (1): 20–22.

Mundrake, G.A., & Brown, B.J. (1989). Application of microcomputer software to university level course instruction. Journal of Education for Business, 64(3), 124-128.

Stein, S.H. (1990). Understanding Regression Analysis. Journal of Education for Business, 65(6), 264-269.

Studenmund, A.H., & Cassidy, H.J. (1987). Using Econometrics: A Practical Guide. Boston: Little, Brown.

Appendix

Table 1: Monthly unit production and the associated costs (sorted from low to high)

| Month | Units (x) | Cost (y) |
|-------|-----------|----------|
| Oct | 60 000 | N 500 000 |
| Nov | 65 000 | N 940 000 |
| Mar | 75 000 | N 840 000 |
| Sept | 80 000 | N 910 000 |
| Feb | 90 000 | N 1 100 000 |
| Dec | 95 000 | N 1 500 000 |
| Jan | 100 000 | N 1 250 000 |
| Aug | 115 000 | N 1 400 000 |
| Apr | 120 000 | N 1 400 000 |
| Jun | 130 000 | N 1 200 000 |
| May | 140 000 | N 1 500 000 |
| Jul | 175 000 | N 2 000 000 |

• 6. Fig. (i): The curve of the total cost
• 7. [Figure (i): The total cost curve; vertical axis: cost, horizontal axis: units]

Table (2): Regression (Linear)

The regression equation is

y = 138533 + 10.3 x   (iii)

| Predictor | Coef | StDev | T | P |
|-----------|------|-------|---|---|
| Constant | 138533 | 178518 | 0.78 | 0.456 |
| x | 10.343 | 1.643 | 6.30 | 0.000 |

S = 184068, R-Sq = 79.9%, R-Sq(adj) = 77.8%

Analysis of Variance

| Source | DF | SS | MS | F | P |
|--------|----|----|----|---|---|
| Regression | 1 | 1.34336E+12 | 1.34336E+12 | 39.65 | 0.000 |
| Error | 10 | 3.38811E+11 | 33881051933 | | |
| Total | 11 | 1.68217E+12 | | | |
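The linear fit in Table (2) and the high-low line of Section 3.1 are both straight-line cost equations, so they can be sketched together. The snippet below is a minimal illustration, assuming the Table 1 data and NumPy rather than the MINITAB 11 session the paper used; the function names `ols_line` and `high_low_line` are hypothetical. With the unrounded high-low slope the fixed cost comes out near -282 609, whereas the paper's -282 000 uses the slope rounded to 13.04.

```python
import numpy as np

# Table 1 data: monthly units produced (x) and total cost in naira (y).
UNITS = np.array([60_000, 65_000, 75_000, 80_000, 90_000, 95_000,
                  100_000, 115_000, 120_000, 130_000, 140_000, 175_000],
                 dtype=float)
COST = np.array([500_000, 940_000, 840_000, 910_000, 1_100_000, 1_500_000,
                 1_250_000, 1_400_000, 1_400_000, 1_200_000, 1_500_000,
                 2_000_000], dtype=float)


def ols_line(x, y):
    """Return (intercept, slope, r_squared) of the least-squares line y = a + b*x."""
    dx = x - x.mean()
    dy = y - y.mean()
    slope = (dx * dy).sum() / (dx ** 2).sum()
    intercept = y.mean() - slope * x.mean()
    residuals = y - (intercept + slope * x)
    r_squared = 1.0 - (residuals ** 2).sum() / (dy ** 2).sum()
    return intercept, slope, r_squared


def high_low_line(x, y):
    """Return (fixed_cost, variable_cost_per_unit) using only the
    highest-activity and lowest-activity observations."""
    hi, lo = np.argmax(x), np.argmin(x)
    b = (y[hi] - y[lo]) / (x[hi] - x[lo])   # slope between the two points
    a = y[hi] - b * x[hi]                   # fixed cost implied by the high point
    return a, b


a_ols, b_ols, r2 = ols_line(UNITS, COST)
a_hl, b_hl = high_low_line(UNITS, COST)
print(f"OLS:      cost = {a_ols:,.0f} + {b_ols:.4f} * units  (R-Sq = {r2:.3f})")
print(f"High-low: cost = {a_hl:,.0f} + {b_hl:.4f} * units")
```

The least-squares line uses all twelve months, while the high-low line depends entirely on October and July, which is why a single unusual month can push its intercept below zero.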
• 8. Fig. (ii): Plot of the Linear regression model

[Figure (ii): Linear regression model for total cost, Y = 138533 + 10.3435X, R-Sq = 0.799; vertical axis: cost, horizontal axis: units; legend: Regression, 95% CI, 95% PI]

Table (3): Polynomial Regression (Quadratic)

Y = -136015 + 15.6406X - 2.33E-05X**2   (iv)

R-Sq = 0.804

Analysis of Variance

| Source | DF | SS | MS | F | P |
|--------|----|----|----|---|---|
| Regression | 2 | 1.35E+12 | 6.76E+11 | 18.4624 | 6.53E-04 |
| Error | 9 | 3.30E+11 | 3.66E+10 | | |
| Total | 11 | 1.68E+12 | | | |

| Source | DF | Seq SS | F | P |
|--------|----|--------|---|---|
| Linear | 1 | 1.34E+12 | 39.6492 | 8.95E-05 |
| Quadratic | 1 | 9.15E+09 | 0.249846 | 0.629176 |
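The sequential rows of Table (3) ask how much the quadratic term adds once the linear term is already in the model. The sketch below illustrates that calculation under the assumption that the Table 1 data are fitted with `numpy.polyfit` instead of MINITAB; the helper `fit_sse` is a hypothetical name, and the P-value is left out to avoid an extra dependency.

```python
import numpy as np

# Table 1 data: monthly units produced (x) and total cost in naira (y).
UNITS = np.array([60_000, 65_000, 75_000, 80_000, 90_000, 95_000,
                  100_000, 115_000, 120_000, 130_000, 140_000, 175_000],
                 dtype=float)
COST = np.array([500_000, 940_000, 840_000, 910_000, 1_100_000, 1_500_000,
                 1_250_000, 1_400_000, 1_400_000, 1_200_000, 1_500_000,
                 2_000_000], dtype=float)


def fit_sse(x, y, degree):
    """Least-squares polynomial fit of the given degree.

    Returns the coefficients (highest power first, as numpy.polyfit does)
    and the error sum of squares of the fitted curve.
    """
    coeffs = np.polyfit(x, y, degree)
    residuals = y - np.polyval(coeffs, x)
    return coeffs, float((residuals ** 2).sum())


sst = float(((COST - COST.mean()) ** 2).sum())   # total sum of squares
_, sse_lin = fit_sse(UNITS, COST, 1)
coef_quad, sse_quad = fit_sse(UNITS, COST, 2)

r2_quad = 1.0 - sse_quad / sst
seq_ss_quad = sse_lin - sse_quad                 # extra SS explained by x**2
df_error = len(UNITS) - 3                        # n minus three fitted parameters
f_quad = seq_ss_quad / (sse_quad / df_error)     # sequential F for the x**2 term

print(f"x**2 coefficient = {coef_quad[0]:.3e}")
print(f"R-Sq = {r2_quad:.3f}, Seq SS = {seq_ss_quad:.3e}, F = {f_quad:.4f}")
```

A small sequential F here means the quadratic term contributes little beyond the straight line, which agrees with the near-identical R-Sq values of Tables (2) and (3).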
• 9. Fig. (iii): Plot of the Quadratic regression model

[Figure (iii): Quadratic regression model for total cost, Y = -136015 + 15.6406X - 2.33E-05X**2, R-Sq = 0.804; vertical axis: cost, horizontal axis: units; legend: Regression, 95% CI, 95% PI]

Table (4): Polynomial Regression (Cubic)

Y = -3888396 + 125.375X - 1.02E-03X**2 + 2.84E-09X**3   (v)

R-Sq = 0.874

Analysis of Variance

| Source | DF | SS | MS | F | P |
|--------|----|----|----|---|---|
| Regression | 3 | 1.47E+12 | 4.90E+11 | 18.5547 | 5.82E-04 |
| Error | 8 | 2.11E+11 | 2.64E+10 | | |
| Total | 11 | 1.68E+12 | | | |

| Source | DF | Seq SS | F | P |
|--------|----|--------|---|---|
| Linear | 1 | 1.34E+12 | 39.6492 | 8.95E-05 |
| Quadratic | 1 | 9.15E+09 | 0.249846 | 0.629176 |
| Cubic | 1 | 1.18E+11 | 4.47643 | 6.73E-02 |

Fig. (iv): Plot of the Cubic regression model
• 10. [Figure (iv): Cubic regression model for total cost, Y = -3888396 + 125.375X - 1.02E-03X**2 + 2.84E-09X**3, R-Sq = 0.874; vertical axis: cost, horizontal axis: units; legend: Regression, 95% CI, 95% PI]
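The cubic model of Table (4) can be refitted and used for prediction along the following lines. This is a sketch under stated assumptions: it uses `numpy.polynomial.Polynomial.fit` (which rescales x internally for numerical stability) in place of MINITAB, the helper `fit_with_sse` is a hypothetical name, and the 150 000-unit activity level is an illustrative input that does not appear in the paper.

```python
import numpy as np
from numpy.polynomial import Polynomial

# Table 1 data: monthly units produced (x) and total cost in naira (y).
UNITS = np.array([60_000, 65_000, 75_000, 80_000, 90_000, 95_000,
                  100_000, 115_000, 120_000, 130_000, 140_000, 175_000],
                 dtype=float)
COST = np.array([500_000, 940_000, 840_000, 910_000, 1_100_000, 1_500_000,
                 1_250_000, 1_400_000, 1_400_000, 1_200_000, 1_500_000,
                 2_000_000], dtype=float)


def fit_with_sse(x, y, degree):
    """Fit a least-squares polynomial and return it with its error sum of squares."""
    poly = Polynomial.fit(x, y, degree)      # fitted on an internally scaled domain
    residuals = y - poly(x)                  # poly(x) accepts the original x values
    return poly, float((residuals ** 2).sum())


sst = float(((COST - COST.mean()) ** 2).sum())
_, sse_quad = fit_with_sse(UNITS, COST, 2)
cubic, sse_cubic = fit_with_sse(UNITS, COST, 3)

r2_cubic = 1.0 - sse_cubic / sst
df_error = len(UNITS) - 4                               # n minus four parameters
f_cubic = (sse_quad - sse_cubic) / (sse_cubic / df_error)  # sequential F for x**3

# Coefficients expressed in the original units, lowest power first.
b0, b1, b2, b3 = cubic.convert().coef

# Illustrative prediction at a hypothetical activity level inside the data range.
predicted_cost = float(cubic(150_000.0))

print(f"cost = {b0:.0f} + {b1:.3f} x + {b2:.3e} x**2 + {b3:.3e} x**3")
print(f"R-Sq = {r2_cubic:.3f}, sequential F (cubic) = {f_cubic:.3f}")
print(f"Predicted cost at 150 000 units: N {predicted_cost:,.0f}")
```

Predictions are best kept within the observed range of 60 000 to 175 000 units, since a cubic can swing sharply outside the data that produced it.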