Getting wrong betas when doing OLS regression in R
My first question here. This problem have stolen days from my life. I know, it's not that important, but at the same time: I need to know! I know there are many good formulas for making regression. But when I try to do it using good-old arithmetic just to get the hangs of it, I get ridiculous answers on beta.
Beta vector is supposed to be (X'X)^(-1)X'y (where X is the matrix of regressors and y the vector of answers). I'll give one example (and that it's not suitable for OLS is irrelevant - I just want b:s here):
X <- matrix(1:10)
y <- matrix(2:11)
b <- (t(X) %*% X)^(-1) %*% t(X) %*% y
Which gives b = 1.142857, while summary(lm(y~X)) gives beta = 1 and an intercept of 1. I add a constant to X to get an intercept: X <-cbind(X,1) and the results I get is b = (2.324675,14.5) which doesn't make sense at all. What am I doing wrong here?
r regression b
add a comment |
My first question here. This problem have stolen days from my life. I know, it's not that important, but at the same time: I need to know! I know there are many good formulas for making regression. But when I try to do it using good-old arithmetic just to get the hangs of it, I get ridiculous answers on beta.
Beta vector is supposed to be (X'X)^(-1)X'y (where X is the matrix of regressors and y the vector of answers). I'll give one example (and that it's not suitable for OLS is irrelevant - I just want b:s here):
X <- matrix(1:10)
y <- matrix(2:11)
b <- (t(X) %*% X)^(-1) %*% t(X) %*% y
Which gives b = 1.142857, while summary(lm(y~X)) gives beta = 1 and an intercept of 1. I add a constant to X to get an intercept: X <-cbind(X,1) and the results I get is b = (2.324675,14.5) which doesn't make sense at all. What am I doing wrong here?
r regression b
1
The right syntax isb <- solve(t(X) %*% X) %*% t(X) %*% y
becausesolve
gives the inverse of the matrixt(X)%*%X
– Marco Sandri
Nov 19 '18 at 15:46
add a comment |
My first question here. This problem have stolen days from my life. I know, it's not that important, but at the same time: I need to know! I know there are many good formulas for making regression. But when I try to do it using good-old arithmetic just to get the hangs of it, I get ridiculous answers on beta.
Beta vector is supposed to be (X'X)^(-1)X'y (where X is the matrix of regressors and y the vector of answers). I'll give one example (and that it's not suitable for OLS is irrelevant - I just want b:s here):
X <- matrix(1:10)
y <- matrix(2:11)
b <- (t(X) %*% X)^(-1) %*% t(X) %*% y
Which gives b = 1.142857, while summary(lm(y~X)) gives beta = 1 and an intercept of 1. I add a constant to X to get an intercept: X <-cbind(X,1) and the results I get is b = (2.324675,14.5) which doesn't make sense at all. What am I doing wrong here?
r regression b
My first question here. This problem have stolen days from my life. I know, it's not that important, but at the same time: I need to know! I know there are many good formulas for making regression. But when I try to do it using good-old arithmetic just to get the hangs of it, I get ridiculous answers on beta.
Beta vector is supposed to be (X'X)^(-1)X'y (where X is the matrix of regressors and y the vector of answers). I'll give one example (and that it's not suitable for OLS is irrelevant - I just want b:s here):
X <- matrix(1:10)
y <- matrix(2:11)
b <- (t(X) %*% X)^(-1) %*% t(X) %*% y
Which gives b = 1.142857, while summary(lm(y~X)) gives beta = 1 and an intercept of 1. I add a constant to X to get an intercept: X <-cbind(X,1) and the results I get is b = (2.324675,14.5) which doesn't make sense at all. What am I doing wrong here?
r regression b
r regression b
asked Nov 19 '18 at 15:36
Mondainai
82
82
1
The right syntax isb <- solve(t(X) %*% X) %*% t(X) %*% y
becausesolve
gives the inverse of the matrixt(X)%*%X
– Marco Sandri
Nov 19 '18 at 15:46
add a comment |
1
The right syntax isb <- solve(t(X) %*% X) %*% t(X) %*% y
becausesolve
gives the inverse of the matrixt(X)%*%X
– Marco Sandri
Nov 19 '18 at 15:46
1
1
The right syntax is
b <- solve(t(X) %*% X) %*% t(X) %*% y
because solve
gives the inverse of the matrix t(X)%*%X
– Marco Sandri
Nov 19 '18 at 15:46
The right syntax is
b <- solve(t(X) %*% X) %*% t(X) %*% y
because solve
gives the inverse of the matrix t(X)%*%X
– Marco Sandri
Nov 19 '18 at 15:46
add a comment |
3 Answers
3
active
oldest
votes
There are two problems here. The first is a problem of notation. The power of -1 in the formula actually indicates a matrix inverse. That is calculated with solve
in R and not with ^-1
, which indicates element-wise reciprocals.
Then, you need to create a design matrix that actually contains an intercept.
X <- matrix(1:10)
y <- matrix(2:11)^2
coef(lm(y~X))
#(Intercept) X
# -21 13
X <- cbind(1, X)
solve(t(X) %*% X) %*% t(X) %*% y
# [,1]
#[1,] -21
#[2,] 13
Obviously, you should not actually do this matrix inversion in real world applications (and R's lm
doesn't do it).
add a comment |
The issue is with using ^(-1)
for the inverse. It doesn't work like that for Matrices. solve
is used to get the inverse of a matrix: https://www.statmethods.net/advstats/matrix.html
# use solve
b <- solve(t(X) %*% X) %*% t(X) %*% y
# fit model without intercept
m <- lm(y~-1+X)
summary(m)
# same coefficients
b
m$coefficients
# with intercept
X2 <- cbind(rep(1, 10), X)
b2 <- solve(t(X2) %*% X2) %*% t(X2) %*% y
m2 <- lm(y~+X)
summary(m2)
b2
m2$coefficients
add a comment |
X <- cbind(1, matrix(1:10))
b<-solve(t(X)%*%X)%*%t(X)%*%y
https://www.rdocumentation.org/packages/Matrix/versions/0.3-26/topics/solve.Matrix
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53377988%2fgetting-wrong-betas-when-doing-ols-regression-in-r%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
3 Answers
3
active
oldest
votes
3 Answers
3
active
oldest
votes
active
oldest
votes
active
oldest
votes
There are two problems here. The first is a problem of notation. The power of -1 in the formula actually indicates a matrix inverse. That is calculated with solve
in R and not with ^-1
, which indicates element-wise reciprocals.
Then, you need to create a design matrix that actually contains an intercept.
X <- matrix(1:10)
y <- matrix(2:11)^2
coef(lm(y~X))
#(Intercept) X
# -21 13
X <- cbind(1, X)
solve(t(X) %*% X) %*% t(X) %*% y
# [,1]
#[1,] -21
#[2,] 13
Obviously, you should not actually do this matrix inversion in real world applications (and R's lm
doesn't do it).
add a comment |
There are two problems here. The first is a problem of notation. The power of -1 in the formula actually indicates a matrix inverse. That is calculated with solve
in R and not with ^-1
, which indicates element-wise reciprocals.
Then, you need to create a design matrix that actually contains an intercept.
X <- matrix(1:10)
y <- matrix(2:11)^2
coef(lm(y~X))
#(Intercept) X
# -21 13
X <- cbind(1, X)
solve(t(X) %*% X) %*% t(X) %*% y
# [,1]
#[1,] -21
#[2,] 13
Obviously, you should not actually do this matrix inversion in real world applications (and R's lm
doesn't do it).
add a comment |
There are two problems here. The first is a problem of notation. The power of -1 in the formula actually indicates a matrix inverse. That is calculated with solve
in R and not with ^-1
, which indicates element-wise reciprocals.
Then, you need to create a design matrix that actually contains an intercept.
X <- matrix(1:10)
y <- matrix(2:11)^2
coef(lm(y~X))
#(Intercept) X
# -21 13
X <- cbind(1, X)
solve(t(X) %*% X) %*% t(X) %*% y
# [,1]
#[1,] -21
#[2,] 13
Obviously, you should not actually do this matrix inversion in real world applications (and R's lm
doesn't do it).
There are two problems here. The first is a problem of notation. The power of -1 in the formula actually indicates a matrix inverse. That is calculated with solve
in R and not with ^-1
, which indicates element-wise reciprocals.
Then, you need to create a design matrix that actually contains an intercept.
X <- matrix(1:10)
y <- matrix(2:11)^2
coef(lm(y~X))
#(Intercept) X
# -21 13
X <- cbind(1, X)
solve(t(X) %*% X) %*% t(X) %*% y
# [,1]
#[1,] -21
#[2,] 13
Obviously, you should not actually do this matrix inversion in real world applications (and R's lm
doesn't do it).
answered Nov 19 '18 at 15:47
Roland
99.1k6111182
99.1k6111182
add a comment |
add a comment |
The issue is with using ^(-1)
for the inverse. It doesn't work like that for Matrices. solve
is used to get the inverse of a matrix: https://www.statmethods.net/advstats/matrix.html
# use solve
b <- solve(t(X) %*% X) %*% t(X) %*% y
# fit model without intercept
m <- lm(y~-1+X)
summary(m)
# same coefficients
b
m$coefficients
# with intercept
X2 <- cbind(rep(1, 10), X)
b2 <- solve(t(X2) %*% X2) %*% t(X2) %*% y
m2 <- lm(y~+X)
summary(m2)
b2
m2$coefficients
add a comment |
The issue is with using ^(-1)
for the inverse. It doesn't work like that for Matrices. solve
is used to get the inverse of a matrix: https://www.statmethods.net/advstats/matrix.html
# use solve
b <- solve(t(X) %*% X) %*% t(X) %*% y
# fit model without intercept
m <- lm(y~-1+X)
summary(m)
# same coefficients
b
m$coefficients
# with intercept
X2 <- cbind(rep(1, 10), X)
b2 <- solve(t(X2) %*% X2) %*% t(X2) %*% y
m2 <- lm(y~+X)
summary(m2)
b2
m2$coefficients
add a comment |
The issue is with using ^(-1)
for the inverse. It doesn't work like that for Matrices. solve
is used to get the inverse of a matrix: https://www.statmethods.net/advstats/matrix.html
# use solve
b <- solve(t(X) %*% X) %*% t(X) %*% y
# fit model without intercept
m <- lm(y~-1+X)
summary(m)
# same coefficients
b
m$coefficients
# with intercept
X2 <- cbind(rep(1, 10), X)
b2 <- solve(t(X2) %*% X2) %*% t(X2) %*% y
m2 <- lm(y~+X)
summary(m2)
b2
m2$coefficients
The issue is with using ^(-1)
for the inverse. It doesn't work like that for Matrices. solve
is used to get the inverse of a matrix: https://www.statmethods.net/advstats/matrix.html
# use solve
b <- solve(t(X) %*% X) %*% t(X) %*% y
# fit model without intercept
m <- lm(y~-1+X)
summary(m)
# same coefficients
b
m$coefficients
# with intercept
X2 <- cbind(rep(1, 10), X)
b2 <- solve(t(X2) %*% X2) %*% t(X2) %*% y
m2 <- lm(y~+X)
summary(m2)
b2
m2$coefficients
answered Nov 19 '18 at 15:47
Jonny Phelps
81037
81037
add a comment |
add a comment |
X <- cbind(1, matrix(1:10))
b<-solve(t(X)%*%X)%*%t(X)%*%y
https://www.rdocumentation.org/packages/Matrix/versions/0.3-26/topics/solve.Matrix
add a comment |
X <- cbind(1, matrix(1:10))
b<-solve(t(X)%*%X)%*%t(X)%*%y
https://www.rdocumentation.org/packages/Matrix/versions/0.3-26/topics/solve.Matrix
add a comment |
X <- cbind(1, matrix(1:10))
b<-solve(t(X)%*%X)%*%t(X)%*%y
https://www.rdocumentation.org/packages/Matrix/versions/0.3-26/topics/solve.Matrix
X <- cbind(1, matrix(1:10))
b<-solve(t(X)%*%X)%*%t(X)%*%y
https://www.rdocumentation.org/packages/Matrix/versions/0.3-26/topics/solve.Matrix
answered Nov 19 '18 at 15:49
paoloeusebi
641413
641413
add a comment |
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Some of your past answers have not been well-received, and you're in danger of being blocked from answering.
Please pay close attention to the following guidance:
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53377988%2fgetting-wrong-betas-when-doing-ols-regression-in-r%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
1
The right syntax is
b <- solve(t(X) %*% X) %*% t(X) %*% y
becausesolve
gives the inverse of the matrixt(X)%*%X
– Marco Sandri
Nov 19 '18 at 15:46