Zero conditional mean, and is regression estimating population regression function?
























$begingroup$


I am relearning econometrics to understand it better and to clear up the confusion I had in college.



In the simple regression model, the population model equation is:



$$ y = \beta_{0} + \beta_{1}x + u \tag{1}$$



SLR Assumption 3 is the zero conditional mean assumption. Do we make this assumption because, in reality, y can take many values for a single value of x, so that we want the distribution of y given x to be centered at E[y|x]? Is this understanding of SLR Assumption 3 correct? Taking the expectation of equation (1) conditional on x gives:



$$ E[y|x] = \beta_{0} + \beta_{1}x + E[u|x] \tag{2}$$



Because any nonzero mean of u can be absorbed into the intercept term, we lose nothing by assuming E[u] = 0. By assuming SLR 3, E[u|x] = E[u] = 0, we are implying that:



$$ E[u|x] = \sum_{u} u\,P_{u|x}(u) = \sum_{u} u\,\dfrac{P_{u,x}(u,x)}{P_{x}(x)} = \sum_{u} u\,\dfrac{P_{u}(u)P_{x}(x)}{P_{x}(x)} = \sum_{u} u\,P_{u}(u) = E[u] \tag{3}$$



To obtain the third equality above, we are treating x and u as independent of each other. The covariance formula for x and u then gives E[ux] = E[u]E[x] = 0:



$$ \operatorname{Cov}(u, x) = E[ux] - E[u]E[x] = E[ux] - 0 = E[ux] = 0 \tag{4}$$



Thus:
$$ E[ux] = 0 \tag{5}$$



And, given this implied lack of correlation between u and x, equation (2) with E[u|x] = 0 becomes the population regression function, obtained by taking the expectation of equation (1) conditional on x:



$$ E[y|x] = \beta_{0} + \beta_{1}x \tag{6}$$



This is a linear relationship between x and the expected value of y: a one-unit change in x leads to a $\beta_{1}$-unit change in the expected value of y. The distribution of y given x is centered at E[y|x].



So my question is: when we estimate by OLS, is the sample regression function estimating the population model, equation (1), or the population regression function, equation (6), and why?



Also, in the multiple regression model, we again assume a zero conditional mean:
$$ E[u|x_{1}, x_{2}, x_{3},\dots,x_{k}] = 0 $$
Are we saying here that u is uncorrelated with the group (x1, ..., xk) as a whole, or can we say that u is uncorrelated with each xi individually, for i = 1, ..., k?
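
One way to look at this last point, sketched under the assumption that all the expectations involved exist: by the law of iterated expectations, for each j = 1, ..., k,
$$ E[x_{j}u] = E\big[E[x_{j}u \mid x_{1},\dots,x_{k}]\big] = E\big[x_{j}\,E[u \mid x_{1},\dots,x_{k}]\big] = E[x_{j}\cdot 0] = 0, $$
and in the same way $E[u] = E\big[E[u \mid x_{1},\dots,x_{k}]\big] = 0$, so that
$$ \operatorname{Cov}(x_{j}, u) = E[x_{j}u] - E[x_{j}]E[u] = 0. $$
If this sketch is right, the zero conditional mean would make u uncorrelated with each xi individually, without requiring independence, while zero correlation alone would not give back the zero conditional mean.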



Thank you for your help and time! Much obliged.





















$endgroup$
























      statistics regression














      edited Jan 7 at 21:53







      commentallez-vous

















      asked Jan 7 at 21:24









      commentallez-vous






















          1 Answer






























          $begingroup$

          OLS regression estimates the conditional expectation, i.e.,
          $$
          \mathbb{E}[y|X=x]=\beta_0 + \beta_1x,
          $$

          namely, the estimated model is
          $$
          \widehat{\mathbb{E}[y|X=x]}=\hat{y} = \hat{\beta}_0+\hat{\beta}_1x.
          $$

          There is no sense in estimating $u$ as $u$ is a random variable.
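
A minimal simulation sketch of this point, assuming NumPy is available. The parameter values, sample size, normal distributions, and the helper name `simulate_ols` are illustrative assumptions, not part of the question. It draws $u$ independently of $x$ (so that $\mathbb{E}[u|x]=0$ holds by construction) and fits the line by least squares:

```python
import numpy as np

def simulate_ols(n=20_000, beta0=1.0, beta1=2.0, seed=0):
    """Simulate y = beta0 + beta1*x + u with E[u|x] = 0 and fit OLS."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    u = rng.normal(size=n)          # drawn independently of x, so E[u|x] = 0
    y = beta0 + beta1 * x + u
    X = np.column_stack([np.ones(n), x])   # design matrix with an intercept column
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef                     # array [beta0_hat, beta1_hat]

b0_hat, b1_hat = simulate_ols()
print(b0_hat, b1_hat)
```

The fitted coefficients describe the line $\beta_0+\beta_1 x$, i.e. the conditional mean; the individual draws of $u$ are never recovered, only the residuals $y-\hat{y}$.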



          Now, let's see what happens if $\mathbb{E}[u|x]$ is not zero. If it is constant, then, as you correctly stated, it can be absorbed into the intercept term. Another possibility is that it depends somehow on $X$, i.e.,
          $$
          \mathbb{E}[u|X=x] = g(x),
          $$

          so the conditional expectation is
          $$
          \mathbb{E}[y|X=x]=\beta_0 + \beta_1x + g(x).
          $$

          Now everything depends on the structure of $g$. If it is linear, say $g(x)=bx$, then you are back to the form of the original model, but with slope $\tilde{\beta}=\beta_1+b$ instead of $\beta_1$:
          $$
          \mathbb{E}[y|X=x]=\beta_0 + \beta_1x + bx = \beta_0+(\beta_1+b)x=\beta_0+\tilde{\beta}x.
          $$
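
The linear case can be checked numerically. The sketch below assumes NumPy, with hypothetical values $\beta_1=2$ and $b=0.5$ and a helper name `slope_with_linear_g` chosen only for illustration. It builds an error term with $\mathbb{E}[u|x]=bx$ and shows that the fitted slope lands near $\beta_1+b$ rather than $\beta_1$:

```python
import numpy as np

def slope_with_linear_g(n=20_000, beta0=1.0, beta1=2.0, b=0.5, seed=1):
    """Fit OLS when the error satisfies E[u|x] = b*x instead of 0."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    e = rng.normal(size=n)          # mean-zero noise, independent of x
    u = b * x + e                   # so E[u|x] = b*x
    y = beta0 + beta1 * x + u
    X = np.column_stack([np.ones(n), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1]                  # fitted slope

slope = slope_with_linear_g()
print(slope)   # should be close to beta1 + b, not beta1
```

So the linear model still fits the conditional mean, but the coefficient it recovers is $\tilde{\beta}$, which matters if $\beta_1$ itself is the quantity of interest.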

          If it has any other parametric structure, then it modifies the model according to that structure; if $g$ is non-parametric or non-measurable, then the simple linear model is simply inappropriate in this case. As for independence, this is a very strong assumption. The basic assumptions only require that the variables are uncorrelated or that the $y$s are conditionally (on $x$) independent. This is enough for the Gauss-Markov theorem to apply.
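
To illustrate how much weaker a zero conditional mean is than independence, here is a hedged sketch, assuming NumPy. The construction $u = x\,e$ is a hypothetical example, not taken from the question. In it $\mathbb{E}[u|x]=0$, yet $u$ clearly depends on $x$ through its variance:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50_000
beta0, beta1 = 1.0, 2.0             # illustrative parameter values

x = rng.normal(size=n)
e = rng.normal(size=n)              # independent of x, mean zero
u = x * e                           # E[u|x] = x*E[e] = 0, but Var(u|x) = x**2

# u is (nearly) uncorrelated with x in the sample, yet not independent of it:
corr_u_x = np.corrcoef(u, x)[0, 1]
corr_u2_x2 = np.corrcoef(u**2, x**2)[0, 1]

y = beta0 + beta1 * x + u
X = np.column_stack([np.ones(n), x])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
b0_hat, b1_hat = coef

print(corr_u_x, corr_u2_x2, b1_hat)
```

In this sketch the slope estimate stays close to $\beta_1$ even though $u$ and $x$ are dependent. The error variance changes with $x$ here, so the example speaks only to the zero conditional mean, not to the constant-variance condition.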















          $endgroup$









          • 1




            $begingroup$
            Thank you Vancak, and after posting this question, I reviewed Gujarati's book compared with Wooldridge's, the graphs in the former one offered me a better view of this zero conditional mean idea. I also read a reply for another question of mine from a member here to prove that zero conditional mean implies error term is uncorrelated with each of the independent variables. Combined with your explanation here, I feel I understood these assumptions much better. Really appreciate it. Thank you for your time and help!
            $endgroup$
            – commentallez-vous
            Jan 9 at 21:41





















          answered Jan 9 at 21:27









          V. Vancak







