How is the union bound applied in this proof?












0












$begingroup$


From Understanding Machine Learning: From theory to algorithms:



How explicitly is the union bound used in the proof of this theorem to get the result in the red box below?



enter image description here










share|cite|improve this question









$endgroup$

















    0












    $begingroup$


    From Understanding Machine Learning: From theory to algorithms:



    How explicitly is the union bound used in the proof of this theorem to get the result in the red box below?



    enter image description here










    share|cite|improve this question









    $endgroup$















      0












      0








      0





      $begingroup$


      From Understanding Machine Learning: From theory to algorithms:



      How explicitly is the union bound used in the proof of this theorem to get the result in the red box below?



      enter image description here










      share|cite|improve this question









      $endgroup$




      From Understanding Machine Learning: From theory to algorithms:



      How explicitly is the union bound used in the proof of this theorem to get the result in the red box below?



      enter image description here







      probability-theory machine-learning






      share|cite|improve this question













      share|cite|improve this question











      share|cite|improve this question




      share|cite|improve this question










      asked Jan 19 at 19:46









      Oliver GOliver G

      1,4581632




      1,4581632






















          1 Answer
          1






          active

          oldest

          votes


















          0












          $begingroup$

          Let your events be $A_n = { |L_D(h) - L_S(h)| < epsilon_n$ holds for all $h in mathcal{H}_n }$.



          $P(A_n^c) < delta_n $ as per the previous paragraph, where I am using $c$ to denote complement. Now, we are looking for the event that all $A_n$'s happen, i.e. for the $P(bigcap_{i=1}^n A_n) = 1 - P(bigcup_{i=1}^n A_n^c)$. Now, using union bound on that last term we get $P(bigcup_{i=1}^n A_n^c) leq sum_i P(A_i^c) leq sum_i delta_i$, so $P(bigcap_{i=1}^n A_n)$ is at least $1 - sum_i delta_i$.





          Adding the elements of the set as per OPs request: the theorem says that the space we are working in is $Omega = mathcal{D}^n$, and we pick some $S$ in it. So, more formally, we have
          $A_n = { S in mathcal{D}^n: |L_D(h) - L_S(h)| < epsilon_n$ holds for all $h in mathcal{H}_n }$.






          share|cite|improve this answer











          $endgroup$













          • $begingroup$
            I'm a little confused on your notation of $A_n$, what is an element of that set?
            $endgroup$
            – Oliver G
            Jan 19 at 21:39












          • $begingroup$
            What does $S in D^n$ mean? I've never seen the notation of a set of samples being an element of a distribution. The notation $S text{~} D^n$ is what is in the book and means the $n$ elements of the set $S$ are chosen according to distribution $D$.
            $endgroup$
            – Oliver G
            Jan 19 at 21:50










          • $begingroup$
            And what do you mean by $Omega = D^n$? Because to me that reads: the sample space is the distribution function that maps events from a sample space to $[0,1]$.
            $endgroup$
            – Oliver G
            Jan 19 at 22:00






          • 1




            $begingroup$
            That is what I thought but it is pretty hard to try to infer the content of a book from one screenshot... Either way, then the sample space is whatever domain $S$ belongs to to the $n$. If you upload the relevant screenshots for the definitions it might be easier for people to help you.
            $endgroup$
            – E-A
            Jan 20 at 2:02











          Your Answer





          StackExchange.ifUsing("editor", function () {
          return StackExchange.using("mathjaxEditing", function () {
          StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
          StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["$", "$"], ["\\(","\\)"]]);
          });
          });
          }, "mathjax-editing");

          StackExchange.ready(function() {
          var channelOptions = {
          tags: "".split(" "),
          id: "69"
          };
          initTagRenderer("".split(" "), "".split(" "), channelOptions);

          StackExchange.using("externalEditor", function() {
          // Have to fire editor after snippets, if snippets enabled
          if (StackExchange.settings.snippets.snippetsEnabled) {
          StackExchange.using("snippets", function() {
          createEditor();
          });
          }
          else {
          createEditor();
          }
          });

          function createEditor() {
          StackExchange.prepareEditor({
          heartbeatType: 'answer',
          autoActivateHeartbeat: false,
          convertImagesToLinks: true,
          noModals: true,
          showLowRepImageUploadWarning: true,
          reputationToPostImages: 10,
          bindNavPrevention: true,
          postfix: "",
          imageUploader: {
          brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
          contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
          allowUrls: true
          },
          noCode: true, onDemand: true,
          discardSelector: ".discard-answer"
          ,immediatelyShowMarkdownHelp:true
          });


          }
          });














          draft saved

          draft discarded


















          StackExchange.ready(
          function () {
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmath.stackexchange.com%2fquestions%2f3079732%2fhow-is-the-union-bound-applied-in-this-proof%23new-answer', 'question_page');
          }
          );

          Post as a guest















          Required, but never shown

























          1 Answer
          1






          active

          oldest

          votes








          1 Answer
          1






          active

          oldest

          votes









          active

          oldest

          votes






          active

          oldest

          votes









          0












          $begingroup$

          Let your events be $A_n = { |L_D(h) - L_S(h)| < epsilon_n$ holds for all $h in mathcal{H}_n }$.



          $P(A_n^c) < delta_n $ as per the previous paragraph, where I am using $c$ to denote complement. Now, we are looking for the event that all $A_n$'s happen, i.e. for the $P(bigcap_{i=1}^n A_n) = 1 - P(bigcup_{i=1}^n A_n^c)$. Now, using union bound on that last term we get $P(bigcup_{i=1}^n A_n^c) leq sum_i P(A_i^c) leq sum_i delta_i$, so $P(bigcap_{i=1}^n A_n)$ is at least $1 - sum_i delta_i$.





          Adding the elements of the set as per OPs request: the theorem says that the space we are working in is $Omega = mathcal{D}^n$, and we pick some $S$ in it. So, more formally, we have
          $A_n = { S in mathcal{D}^n: |L_D(h) - L_S(h)| < epsilon_n$ holds for all $h in mathcal{H}_n }$.






          share|cite|improve this answer











          $endgroup$













          • $begingroup$
            I'm a little confused on your notation of $A_n$, what is an element of that set?
            $endgroup$
            – Oliver G
            Jan 19 at 21:39












          • $begingroup$
            What does $S in D^n$ mean? I've never seen the notation of a set of samples being an element of a distribution. The notation $S text{~} D^n$ is what is in the book and means the $n$ elements of the set $S$ are chosen according to distribution $D$.
            $endgroup$
            – Oliver G
            Jan 19 at 21:50










          • $begingroup$
            And what do you mean by $Omega = D^n$? Because to me that reads: the sample space is the distribution function that maps events from a sample space to $[0,1]$.
            $endgroup$
            – Oliver G
            Jan 19 at 22:00






          • 1




            $begingroup$
            That is what I thought but it is pretty hard to try to infer the content of a book from one screenshot... Either way, then the sample space is whatever domain $S$ belongs to to the $n$. If you upload the relevant screenshots for the definitions it might be easier for people to help you.
            $endgroup$
            – E-A
            Jan 20 at 2:02
















          0












          $begingroup$

          Let your events be $A_n = { |L_D(h) - L_S(h)| < epsilon_n$ holds for all $h in mathcal{H}_n }$.



          $P(A_n^c) < delta_n $ as per the previous paragraph, where I am using $c$ to denote complement. Now, we are looking for the event that all $A_n$'s happen, i.e. for the $P(bigcap_{i=1}^n A_n) = 1 - P(bigcup_{i=1}^n A_n^c)$. Now, using union bound on that last term we get $P(bigcup_{i=1}^n A_n^c) leq sum_i P(A_i^c) leq sum_i delta_i$, so $P(bigcap_{i=1}^n A_n)$ is at least $1 - sum_i delta_i$.





          Adding the elements of the set as per OPs request: the theorem says that the space we are working in is $Omega = mathcal{D}^n$, and we pick some $S$ in it. So, more formally, we have
          $A_n = { S in mathcal{D}^n: |L_D(h) - L_S(h)| < epsilon_n$ holds for all $h in mathcal{H}_n }$.






          share|cite|improve this answer











          $endgroup$













          • $begingroup$
            I'm a little confused on your notation of $A_n$, what is an element of that set?
            $endgroup$
            – Oliver G
            Jan 19 at 21:39












          • $begingroup$
            What does $S in D^n$ mean? I've never seen the notation of a set of samples being an element of a distribution. The notation $S text{~} D^n$ is what is in the book and means the $n$ elements of the set $S$ are chosen according to distribution $D$.
            $endgroup$
            – Oliver G
            Jan 19 at 21:50










          • $begingroup$
            And what do you mean by $Omega = D^n$? Because to me that reads: the sample space is the distribution function that maps events from a sample space to $[0,1]$.
            $endgroup$
            – Oliver G
            Jan 19 at 22:00






          • 1




            $begingroup$
            That is what I thought but it is pretty hard to try to infer the content of a book from one screenshot... Either way, then the sample space is whatever domain $S$ belongs to to the $n$. If you upload the relevant screenshots for the definitions it might be easier for people to help you.
            $endgroup$
            – E-A
            Jan 20 at 2:02














          0












          0








          0





          $begingroup$

          Let your events be $A_n = { |L_D(h) - L_S(h)| < epsilon_n$ holds for all $h in mathcal{H}_n }$.



          $P(A_n^c) < delta_n $ as per the previous paragraph, where I am using $c$ to denote complement. Now, we are looking for the event that all $A_n$'s happen, i.e. for the $P(bigcap_{i=1}^n A_n) = 1 - P(bigcup_{i=1}^n A_n^c)$. Now, using union bound on that last term we get $P(bigcup_{i=1}^n A_n^c) leq sum_i P(A_i^c) leq sum_i delta_i$, so $P(bigcap_{i=1}^n A_n)$ is at least $1 - sum_i delta_i$.





          Adding the elements of the set as per OPs request: the theorem says that the space we are working in is $Omega = mathcal{D}^n$, and we pick some $S$ in it. So, more formally, we have
          $A_n = { S in mathcal{D}^n: |L_D(h) - L_S(h)| < epsilon_n$ holds for all $h in mathcal{H}_n }$.






          share|cite|improve this answer











          $endgroup$



          Let your events be $A_n = { |L_D(h) - L_S(h)| < epsilon_n$ holds for all $h in mathcal{H}_n }$.



          $P(A_n^c) < delta_n $ as per the previous paragraph, where I am using $c$ to denote complement. Now, we are looking for the event that all $A_n$'s happen, i.e. for the $P(bigcap_{i=1}^n A_n) = 1 - P(bigcup_{i=1}^n A_n^c)$. Now, using union bound on that last term we get $P(bigcup_{i=1}^n A_n^c) leq sum_i P(A_i^c) leq sum_i delta_i$, so $P(bigcap_{i=1}^n A_n)$ is at least $1 - sum_i delta_i$.





          Adding the elements of the set as per OPs request: the theorem says that the space we are working in is $Omega = mathcal{D}^n$, and we pick some $S$ in it. So, more formally, we have
          $A_n = { S in mathcal{D}^n: |L_D(h) - L_S(h)| < epsilon_n$ holds for all $h in mathcal{H}_n }$.







          share|cite|improve this answer














          share|cite|improve this answer



          share|cite|improve this answer








          edited Jan 19 at 21:44

























          answered Jan 19 at 21:10









          E-AE-A

          2,1121414




          2,1121414












          • $begingroup$
            I'm a little confused on your notation of $A_n$, what is an element of that set?
            $endgroup$
            – Oliver G
            Jan 19 at 21:39












          • $begingroup$
            What does $S in D^n$ mean? I've never seen the notation of a set of samples being an element of a distribution. The notation $S text{~} D^n$ is what is in the book and means the $n$ elements of the set $S$ are chosen according to distribution $D$.
            $endgroup$
            – Oliver G
            Jan 19 at 21:50










          • $begingroup$
            And what do you mean by $Omega = D^n$? Because to me that reads: the sample space is the distribution function that maps events from a sample space to $[0,1]$.
            $endgroup$
            – Oliver G
            Jan 19 at 22:00






          • 1




            $begingroup$
            That is what I thought but it is pretty hard to try to infer the content of a book from one screenshot... Either way, then the sample space is whatever domain $S$ belongs to to the $n$. If you upload the relevant screenshots for the definitions it might be easier for people to help you.
            $endgroup$
            – E-A
            Jan 20 at 2:02


















          • $begingroup$
            I'm a little confused on your notation of $A_n$, what is an element of that set?
            $endgroup$
            – Oliver G
            Jan 19 at 21:39












          • $begingroup$
            What does $S in D^n$ mean? I've never seen the notation of a set of samples being an element of a distribution. The notation $S text{~} D^n$ is what is in the book and means the $n$ elements of the set $S$ are chosen according to distribution $D$.
            $endgroup$
            – Oliver G
            Jan 19 at 21:50










          • $begingroup$
            And what do you mean by $Omega = D^n$? Because to me that reads: the sample space is the distribution function that maps events from a sample space to $[0,1]$.
            $endgroup$
            – Oliver G
            Jan 19 at 22:00






          • 1




            $begingroup$
            That is what I thought but it is pretty hard to try to infer the content of a book from one screenshot... Either way, then the sample space is whatever domain $S$ belongs to to the $n$. If you upload the relevant screenshots for the definitions it might be easier for people to help you.
            $endgroup$
            – E-A
            Jan 20 at 2:02
















          $begingroup$
          I'm a little confused on your notation of $A_n$, what is an element of that set?
          $endgroup$
          – Oliver G
          Jan 19 at 21:39






          $begingroup$
          I'm a little confused on your notation of $A_n$, what is an element of that set?
          $endgroup$
          – Oliver G
          Jan 19 at 21:39














          $begingroup$
          What does $S in D^n$ mean? I've never seen the notation of a set of samples being an element of a distribution. The notation $S text{~} D^n$ is what is in the book and means the $n$ elements of the set $S$ are chosen according to distribution $D$.
          $endgroup$
          – Oliver G
          Jan 19 at 21:50




          $begingroup$
          What does $S in D^n$ mean? I've never seen the notation of a set of samples being an element of a distribution. The notation $S text{~} D^n$ is what is in the book and means the $n$ elements of the set $S$ are chosen according to distribution $D$.
          $endgroup$
          – Oliver G
          Jan 19 at 21:50












          $begingroup$
          And what do you mean by $Omega = D^n$? Because to me that reads: the sample space is the distribution function that maps events from a sample space to $[0,1]$.
          $endgroup$
          – Oliver G
          Jan 19 at 22:00




          $begingroup$
          And what do you mean by $Omega = D^n$? Because to me that reads: the sample space is the distribution function that maps events from a sample space to $[0,1]$.
          $endgroup$
          – Oliver G
          Jan 19 at 22:00




          1




          1




          $begingroup$
          That is what I thought but it is pretty hard to try to infer the content of a book from one screenshot... Either way, then the sample space is whatever domain $S$ belongs to to the $n$. If you upload the relevant screenshots for the definitions it might be easier for people to help you.
          $endgroup$
          – E-A
          Jan 20 at 2:02




          $begingroup$
          That is what I thought but it is pretty hard to try to infer the content of a book from one screenshot... Either way, then the sample space is whatever domain $S$ belongs to to the $n$. If you upload the relevant screenshots for the definitions it might be easier for people to help you.
          $endgroup$
          – E-A
          Jan 20 at 2:02


















          draft saved

          draft discarded




















































          Thanks for contributing an answer to Mathematics Stack Exchange!


          • Please be sure to answer the question. Provide details and share your research!

          But avoid



          • Asking for help, clarification, or responding to other answers.

          • Making statements based on opinion; back them up with references or personal experience.


          Use MathJax to format equations. MathJax reference.


          To learn more, see our tips on writing great answers.




          draft saved


          draft discarded














          StackExchange.ready(
          function () {
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmath.stackexchange.com%2fquestions%2f3079732%2fhow-is-the-union-bound-applied-in-this-proof%23new-answer', 'question_page');
          }
          );

          Post as a guest















          Required, but never shown





















































          Required, but never shown














          Required, but never shown












          Required, but never shown







          Required, but never shown

































          Required, but never shown














          Required, but never shown












          Required, but never shown







          Required, but never shown







          Popular posts from this blog

          Can a sorcerer learn a 5th-level spell early by creating spell slots using the Font of Magic feature?

          Does disintegrating a polymorphed enemy still kill it after the 2018 errata?

          A Topological Invariant for $pi_3(U(n))$