I will estimate the following model
Y(t) = a + bX(t) + e(t)
Y(t) is the GDP per Capita growth in the Democratic Republic of Congo (DRC).
X(t) is the change in the price of copper.
t indexes years.
the sample period is from 1995 to 2013.
e(t) is an error term.
a and b are unknown coefficients.
The null hypothesis is that change in the price of copper is not significantly related to the DRC's growth, even though copper is an abundant resource in the DRC that for long has greatly contributed to the state budget.
Y(t) = a + bX(t) + e(t)
Y(t) is the GDP per Capita growth in the Democratic Republic of Congo (DRC).
X(t) is the change in the price of copper.
t indexes years.
the sample period is from 1995 to 2013.
e(t) is an error term.
a and b are unknown coefficients.
The null hypothesis is that change in the price of copper is not significantly related to the DRC's growth, even though copper is an abundant resource in the DRC that for long has greatly contributed to the state budget.
Copper price returns (X) = [-24.62%, -0.79%, -31.95%, -5.03%, 14.23%, -13.89%, -1.20%, 13.18%, 47.67%, 24.97%, 60.28%, 5.73%, -2.31%, -30.06%, 38.06%, 15.84%, -10.32%, -8.25%]
ones = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]'
growth (Y) = [-7.41%, -8.00%, 0.00%, -8.70%, 24.12%, 0.00%, 35.67%, -16.25%, 11.12%, 10.01%, 9.10%, 12.26%, 10.92%, 3.39%, 6.45%, 3.08%, 11.44%, 7.80%]'
You can run a simple regression under Data, Data Analysis, and regression. What follows here is more a curiosity than a real challenge to estimate the unknown coefficients. However, you are losing nothing knowing what the program does by following its steps.
We know from the introductory course in statistics that the unknown coefficients are stack into a vector of coefficients, beta, which is here
beta = [a, b]'
beta = (X'X)^(-1)X'Y
X is a matrix including all the explanatory variables. Our explanatory variables are X(t) and ones with n by 2 dimension (nx2), which means that (X'X) gives a 2x2 matrix. In excel, select a 2 by 2 area and write in the first cell of the selected area the following
m1 = minverse(mmult(transpose(range of Xs),range of Xs)), and take ctrl+shift+enter
The dimension of X'Y is pxn nx1, giving px1, where p is the number of coefficients. Select two cells in a column, then write
m2 = mmult(transpose(range of Xs),range of Y), and take ctrl+shift+enter
The dimension of (X'X^(-1)X'Y is px1. Therefore select two cells in a column and write in the first cell of the selected area
m3 = mmult(transpose(m1,m2), and take ctrl+shift+enter
If you used the data above, a=0.052971 and b=0.105249.
It seems to exist a positive relationship between change in gdp and change in copper price. To make sure that such relationship is statistically significant we need to estimate the standard error of the estimates.
We first have to estimate the residual standard deviation. The dimension of the error is nx1, which suggests we have to select a range of the same dimension as our data. In the first cell of the select range write the following:
e = (range of y) - mmult(range of Xs,beta), and take ctrl+shift+enter
We also estimate the dependent variable residuals as y - average of y next to the column of e as
w = (range of y) - average(range of y), and take ctrl+shift+enter
We will compute both the sum of residual squares (ssr) and the total sum of squares (sst).
In a cell write to obtain
We also estimate the dependent variable residuals as y - average of y next to the column of e as
w = (range of y) - average(range of y), and take ctrl+shift+enter
We will compute both the sum of residual squares (ssr) and the total sum of squares (sst).
In a cell write to obtain
ssr = mmult(transpose(range of e),range of e)), and take ctrl+shift+enter => 0.24077
In another cell write to obtain
sst = mmult(transpose(range of w),range of w)), and take ctrl+shift+enter => 0.253078
We compute the residual variance as the ratio of ssr to n-p, where n is the number of observations.
In a cell write to obtain
n = count(range of ones) => 18
In another cell write to obtain
In another cell write to obtain
sst = mmult(transpose(range of w),range of w)), and take ctrl+shift+enter => 0.253078
We compute the residual variance as the ratio of ssr to n-p, where n is the number of observations.
In a cell write to obtain
n = count(range of ones) => 18
In another cell write to obtain
var = (ssr/(n-2))
In another cell write to obtain
sigma = var^0.5 = 0.122673
We compute the r-square as 1 minus the ratio of ssr to sst.
In a cell write to obtain
rsq = 1-(ssr/sst) = 0.048605
We compute the adjusted r-square as 1 - ((n-1)/(n-p))(1-rsq). In a cell write to obtain
arsq = 1 - ((n-1)/(n-2))*(1-rsq) = -0.01086
In another cell write to obtain
sigma = var^0.5 = 0.122673
We compute the r-square as 1 minus the ratio of ssr to sst.
In a cell write to obtain
rsq = 1-(ssr/sst) = 0.048605
We compute the adjusted r-square as 1 - ((n-1)/(n-p))(1-rsq). In a cell write to obtain
arsq = 1 - ((n-1)/(n-2))*(1-rsq) = -0.01086
The standard error of a and b are sigma times their corresponding values in m1. The diagonal values in m1 are positive. use = to arrange the two values in a column. So, this is the diag vector. Select two cells in the column next to the estimated coefficients and write in the first cell of the selected range
s.e. = sqrt(mmult(range of diag), var), and take ctrl+shift+enter
The s.e. of a and b are 0.029514 and 0.116412, respectively. So, the t-stat for a and b are 1.8 and 0.9, respectively; far below 1.95.
We conclude that change in price of copper does not significantly impact the gdp growth of the DRC. We need may be to extend our model by including other variables and more data points. However, we have shown that we can estimate the unknown coefficients and their standard errors in MS Excel using the matrix language.
No comments:
Post a Comment