Given some statistical measures, reconstruct a list of numbers












Suppose I have a list of numbers $(y_1, y_2, y_3, \dots, y_N)$ with these properties:

$$ \sum_{i=1}^{N} y_i = 13{,}776{,}663, $$
$$ \bar{y} = \dfrac{1}{N} \sum_{i=1}^{N} y_i = 17{,}135, $$
$$ s^2 = \dfrac{1}{N-1} \sum_{i=1}^{N} (y_i - \bar{y})^2 = 139{,}147^2. $$



That list contains these values:

  • The lowest value is $19$.
  • The $5$th percentile is $336$.
  • The $25$th percentile is $800$.
  • The median is $1{,}668$.
  • The $75$th percentile is $5{,}050$.
  • The $95$th percentile is $30{,}295$.
  • The highest value is $2{,}627{,}319$.


These percentiles give some idea of how the numbers are distributed. I can already construct a list with that mean and standard deviation if it doesn't matter whether some $y_i$ is less than zero or whether the list follows the described distribution. The problem I face is to construct a list with mean $\bar y$ and standard deviation $s$ subject to the condition that every $y_i$ is greater than zero (the list doesn't have to follow that distribution, and it doesn't have to contain those exact numbers).

So I am looking for a way to do that. If anybody has any ideas, I'm happy to hear them!










Tags: statistics, standard-deviation, means






asked Nov 21 '18 at 3:13 by David






















3 Answers

The solution is probably not unique, and you would want to find one numerically. I would use the approach that was used to build the Datasaurus dataset.

The first step is to find $N$. From the first two equations you get $N \approx 804$. Since $N$ does not come out as an exact integer, that is the first indication that these numbers are only approximations. The last equation gives you $\bar{y^2}$. Now choose $y_1 = 19$ and $y_{804} = 2627319$. You can then recalculate $\bar y$ and $\bar{y^2}$ for the remaining values. Put $401$ values at the median and the remaining $401$ at a single value chosen so that the average (or the sum) matches your target. Obviously, $\bar{y^2}$ is going to be wrong. Move one value from the median down, to somewhere at or below the 5th percentile. To keep the same average, you must move at least one value from the upper group upward. Check whether moving one value or moving two values upward brings $\bar{y^2}$ closer to its target. Repeat this procedure until all your conditions are met.
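
As a rough illustration of that first step, the sketch below recovers $N$ from the reported sum and mean, and turns $s^2$ into the target for $\bar{y^2}$ using the identity $(N-1)s^2 = \sum y_i^2 - N\bar y^2$. The variable names are my own assumptions, not part of the answer.

```python
# Reported summary statistics from the question.
total = 13_776_663
y_mean = 17_135
y_sd = 139_147

# N is the ratio of the sum to the mean, rounded to the nearest integer.
n_exact = total / y_mean
N = round(n_exact)

# (N - 1) * s^2 = sum(y_i^2) - N * mean^2, so the sum of squares is:
sum_sq = (N - 1) * y_sd**2 + N * y_mean**2
mean_sq = sum_sq / N  # target mean of the squares

print(n_exact, N)
print(mean_sq)
```

The small gap between `n_exact` and `N` is what suggests the reported figures were rounded.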






  • Follow the link to the code on Autodesk Research. – Andrei, Nov 21 '18 at 3:53
  • I like this approach. If I succeed, I will mark your answer as the correct answer (of course, if there is no better answer). – David, Nov 21 '18 at 3:58



















Every number is definitely positive, because the lowest is 19.

This seems tractable to me, assuming it's possible. My recommendation is to start with an arbitrary list that satisfies the list of percentile conditions at the bottom of the question. These can be thought of as fixed "milestones". Then move the other numbers around until you satisfy the mean and standard deviation.

By choosing which elements to move (i.e., the largest elements, the smallest elements, or ones in the middle) and whether to move them up or down, you can increase or decrease the mean and standard deviation as necessary. With some thought (or some experimentation), you'll be able to figure out what to do from here.
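
One way to picture such a move is to lower one value by $d$ and raise a larger one by the same $d$: the sum stays the same, while the sum of squares grows, so the standard deviation goes up without touching the mean. The sketch below is only an illustration with a toy list; the helper `spread_pair` is a hypothetical name, not something from the answer.

```python
import statistics as stat


def spread_pair(values, i, j, d):
    """Return a copy with values[i] lowered by d and values[j] raised by d.

    The sum (and so the mean) is unchanged. When values[j] >= values[i]
    and d > 0, the sum of squares grows by
    2*d*(values[j] - values[i]) + 2*d**2, so the standard deviation increases.
    """
    out = list(values)
    out[i] -= d
    out[j] += d
    return out


data = [19, 336, 800, 1668, 5050, 30295]   # toy list, not the real data
moved = spread_pair(data, 2, 4, 100)        # 800 -> 700, 5050 -> 5150

print(stat.mean(data), stat.mean(moved))    # the two means agree
print(stat.stdev(data), stat.stdev(moved))  # the second is larger
```

Keeping $d$ smaller than the value being lowered keeps every entry positive, which is the constraint the question asks for.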






OK, I managed to find an answer to my own question. I wanted to do it numerically, so I used Python. Here is the code:



    import random
    import statistics as stat

    from scipy.optimize import root

    random.seed(210)  # fixes the random draws so the run is reproducible

    N = 804
    y_mean = 17_135
    y_sd = 139_147
    median = 1668
    lowest = 19
    highest_95 = 30295

    l_nums = []
    rango1 = range(lowest, median)      # integers from the lowest value up to the median
    rango2 = range(median, highest_95)  # integers from the median up to the 95th percentile

    # N // 2 values drawn from the lower range
    for _ in range(N // 2):
        numero = random.choice(rango1)
        l_nums.append(numero)

    # N // 2 - 3 values from the upper range, leaving room for
    # the maximum and the two unknowns
    for _ in range(N // 2 - 3):
        numero = random.choice(rango2)
        l_nums.append(numero)

    l_nums.append(2627319)  # the highest value

    print(len(l_nums))
    print(stat.mean(l_nums), stat.stdev(l_nums))


    def equations(x):
        """Residuals of the mean and variance conditions for the two unknowns."""
        a = sum(l_nums)
        b = sum((v - y_mean) ** 2 for v in l_nums)
        return [
            a + x[0] + x[1] - y_mean * N,
            b + (x[0] - y_mean) ** 2 + (x[1] - y_mean) ** 2 - y_sd**2 * (N - 1),
        ]


    x_sol = root(equations, [5e8, 5e8], method='lm')

    print(x_sol.fun)  # residuals of the two equations
    print(x_sol.x)    # the two remaining numbers

    l_nums.extend(x_sol.x)
    print(len(l_nums))
    print(stat.mean(l_nums), stat.stdev(l_nums))


Here is how the code works. First, find $N$; in this case $N = 804$. Then create two ranges of integers, one from $19$ up to the median and the other from the median up to $30295$, the 95th percentile. In Python:

    rango1 = range(lowest, median)
    rango2 = range(median, highest_95)


From rango1, draw $N/2$ numbers at random and put them in a list. Then, from rango2, draw $N/2 - 3$ numbers at random and add them to that list. Now you have a list with $801$ numbers. The highest number is $2{,}627{,}319$, so append it:

    l_nums.append(2627319)


To find the last two numbers, $x$ and $y$, you have to solve two equations:

$$\frac{x+y+a}{N}=\bar y,$$

$$\dfrac{(x-\bar y)^2+(y-\bar y)^2+ b}{N-1}=s^2,$$

where $a$ is the sum of the $802$ numbers already in the list and $b$ is the sum of their squared deviations from $\bar y$.

This is done with SciPy. In my case, I had to add the line random.seed(210) to get exact results, and this depends on the operating system and the computer. Without that line, the results are only close.
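
The two equations can also be solved by hand, which makes it easier to see when a real solution exists. Writing $u = x - \bar y$ and $v = y - \bar y$, the conditions become $u + v = P$ and $u^2 + v^2 = Q$, with $P = N\bar y - a - 2\bar y$ and $Q = (N-1)s^2 - b$, so $u, v = \tfrac{1}{2}\bigl(P \pm \sqrt{2Q - P^2}\bigr)$. The sketch below follows that derivation; the function name `last_two` and the toy inputs are hypothetical, and it is an alternative to the SciPy call rather than the code used above.

```python
import math
import statistics as stat


def last_two(nums, N, y_mean, y_sd):
    """Solve for the two values that complete nums to length N with the
    requested mean and sample standard deviation.

    Returns (x, y), or None when no real solution exists.
    """
    a = sum(nums)
    b = sum((v - y_mean) ** 2 for v in nums)
    P = N * y_mean - a - 2 * y_mean   # u + v
    Q = (N - 1) * y_sd**2 - b         # u^2 + v^2
    disc = 2 * Q - P**2               # equals (u - v)^2
    if disc < 0:
        return None                   # the two conditions are incompatible
    gap = math.sqrt(disc)
    return y_mean + (P + gap) / 2, y_mean + (P - gap) / 2


# Toy check: complete [1, 2, 3] to five numbers that share the mean and
# standard deviation of [1, 2, 3, 4, 5].
target = [1, 2, 3, 4, 5]
sol = last_two([1, 2, 3], 5, stat.mean(target), stat.stdev(target))
print(sol)
```

A result of `None`, or a pair containing a value that is not positive, would suggest that the other numbers in the list need to be redrawn or adjusted before the last two can be filled in.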






First answer: answered Nov 21 '18 at 3:51 by Andrei












Second answer: answered Nov 21 '18 at 3:58 by Nate 8























                      0














                      Ok, I managed to find an answer to my question. I wanted to do it numerically and I used Python. Here is the code:



                      import statistics as stat
                      import random
                      import sys
                      from scipy.optimize import fsolve, root
                      import matplotlib.pyplot as plt

                      random.seed(210)

                      N = 804
                      y_mean = 17_135
                      y_sd = 139_147
                      median = 1668
                      lowest = 19
                      highest_95 = 30295

                      l_nums =
                      rango1 = range(lowest, median)
                      rango2 = range(median, highest_95)

                      for _ in range(N // 2):

                      numero = random.choice(rango1)
                      l_nums.append(numero)

                      for _ in range(N // 2 - 3):

                      numero = random.choice(rango2)
                      l_nums.append(numero)

                      l_nums.append(2627319)

                      print(len(l_nums))
                      print(stat.mean(l_nums), stat.stdev(l_nums))

                      #sys.exit('!')

                      def equations(x):

                      a = sum(l_nums)
                      b = sum(map(lambda x: (x - y_mean)**2, l_nums))

                      f = [a + x[0] + x[1] - y_mean * N,
                      b + (x[0] - y_mean)**2 + (x[1] - y_mean)**2 - y_sd**2 * (N - 1)]

                      return f

                      x_sol = root(equations, [5e8, 5e8], method='lm')

                      #print(x_sol)
                      print(x_sol.fun)
                      print(x_sol.x)

                      l_nums.extend(x_sol.x)
                      print(len(l_nums))
                      print(stat.mean(l_nums), stat.stdev(l_nums))


                      I explain my code. First, find $N$, in this case $N = 804$. Create two lists of numbers, one between $19$ and the median, the other between the median and $30295$. In Python



                      rango1 = range(lowest, median)
                      rango2 = range(median, highest_95)


                      From rango1, draw $N/2$ numbers randomly and put them in a list. Then, from rango2, draw $N/2 -3$ numbers randomly and add them to that list. Now you have a list with $801$ numbers. Good. As you can see, the highest number is $2,627,319$, add it.



                      l_nums.append(2627319)


                      To find the last two numbers, you have to solve two equations



                      $$frac{x+y+a}{N}=bar y,$$



                      $$dfrac{(x-bar y)^2+(y-bar y)^2+ b}{N-1}=s^2.$$



                      That is done with Scipy. In my case, I have to add the line random.seed(210) in order to get the exact results, which depends on the operative system and the computer. Without that line, the results are close.






                      share|cite|improve this answer


























                        0














                        Ok, I managed to find an answer to my question. I wanted to do it numerically and I used Python. Here is the code:



                        import statistics as stat
                        import random
                        import sys
                        from scipy.optimize import fsolve, root
                        import matplotlib.pyplot as plt

                        random.seed(210)

                        N = 804
                        y_mean = 17_135
                        y_sd = 139_147
                        median = 1668
                        lowest = 19
                        highest_95 = 30295

                        l_nums =
                        rango1 = range(lowest, median)
                        rango2 = range(median, highest_95)

                        for _ in range(N // 2):

                        numero = random.choice(rango1)
                        l_nums.append(numero)

                        for _ in range(N // 2 - 3):

                        numero = random.choice(rango2)
                        l_nums.append(numero)

                        l_nums.append(2627319)

                        print(len(l_nums))
                        print(stat.mean(l_nums), stat.stdev(l_nums))

                        #sys.exit('!')

                        def equations(x):

                        a = sum(l_nums)
                        b = sum(map(lambda x: (x - y_mean)**2, l_nums))

                        f = [a + x[0] + x[1] - y_mean * N,
                        b + (x[0] - y_mean)**2 + (x[1] - y_mean)**2 - y_sd**2 * (N - 1)]

                        return f

                        x_sol = root(equations, [5e8, 5e8], method='lm')

                        #print(x_sol)
                        print(x_sol.fun)
                        print(x_sol.x)

                        l_nums.extend(x_sol.x)
                        print(len(l_nums))
                        print(stat.mean(l_nums), stat.stdev(l_nums))


                        I explain my code. First, find $N$, in this case $N = 804$. Create two lists of numbers, one between $19$ and the median, the other between the median and $30295$. In Python



                        rango1 = range(lowest, median)
                        rango2 = range(median, highest_95)


                        From rango1, draw $N/2$ numbers randomly and put them in a list. Then, from rango2, draw $N/2 -3$ numbers randomly and add them to that list. Now you have a list with $801$ numbers. Good. As you can see, the highest number is $2,627,319$, add it.



                        l_nums.append(2627319)


                        To find the last two numbers, you have to solve two equations



                        $$frac{x+y+a}{N}=bar y,$$



                        $$dfrac{(x-bar y)^2+(y-bar y)^2+ b}{N-1}=s^2.$$



                        That is done with Scipy. In my case, I have to add the line random.seed(210) in order to get the exact results, which depends on the operative system and the computer. Without that line, the results are close.






                        share|cite|improve this answer
























                          0












                          0








                          0






                          Ok, I managed to find an answer to my question. I wanted to do it numerically and I used Python. Here is the code:



    import random
    import statistics as stat

    from scipy.optimize import root

    random.seed(210)  # fix the random draws so the run is reproducible

    N = 804
    y_mean = 17_135
    y_sd = 139_147
    median = 1668
    lowest = 19
    highest_95 = 30295

    l_nums = []
    rango1 = range(lowest, median)      # from the lowest value up to (excluding) the median
    rango2 = range(median, highest_95)  # from the median up to (excluding) the 95th percentile

    # Half of the numbers lie below the median.
    for _ in range(N // 2):
        numero = random.choice(rango1)
        l_nums.append(numero)

    # The other half, minus three slots: one for the maximum, two for the solved values.
    for _ in range(N // 2 - 3):
        numero = random.choice(rango2)
        l_nums.append(numero)

    l_nums.append(2627319)  # the highest value

    print(len(l_nums))
    print(stat.mean(l_nums), stat.stdev(l_nums))

    def equations(x):
        # Residuals of the mean and variance conditions for the unknowns x[0], x[1].
        a = sum(l_nums)
        b = sum((v - y_mean)**2 for v in l_nums)

        f = [a + x[0] + x[1] - y_mean * N,
             b + (x[0] - y_mean)**2 + (x[1] - y_mean)**2 - y_sd**2 * (N - 1)]

        return f

    x_sol = root(equations, [5e8, 5e8], method='lm')

    print(x_sol.fun)  # residuals; both should be close to zero
    print(x_sol.x)    # the two remaining numbers

    l_nums.extend(x_sol.x)
    print(len(l_nums))
    print(stat.mean(l_nums), stat.stdev(l_nums))


Let me explain the code. First, find $N$: dividing the sum by the mean gives $13,776,663 / 17,135 \approx 804$, so $N = 804$. Create two ranges of integers, one from $19$ up to the median and the other from the median up to $30295$ (the $95$th percentile). In Python:



                          rango1 = range(lowest, median)
                          rango2 = range(median, highest_95)


From rango1, draw $N/2 = 402$ numbers at random and put them in a list. Then, from rango2, draw $N/2 - 3 = 399$ numbers at random and append them to that list. The list now has $801$ numbers. The highest number is $2,627,319$; append it as well, which brings the list to $802$ numbers:



                          l_nums.append(2627319)
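The construction described so far can be wrapped in a small function so that it works for any even $N$. This is only a sketch: the name build_partial_list and its parameters are mine, not part of the original script, and because it uses its own random.Random generator it will not reproduce the exact draws of the code above.

```python
import random

def build_partial_list(n, lowest, median, p95, highest, seed=None):
    """Return n - 2 positive integers: half below the median, the rest
    between the median and the 95th percentile, plus the maximum.
    (Hypothetical helper, assumes n is even.)"""
    rng = random.Random(seed)  # local generator, leaves the global state alone
    below = [rng.randrange(lowest, median) for _ in range(n // 2)]
    above = [rng.randrange(median, p95) for _ in range(n // 2 - 3)]
    return below + above + [highest]

nums = build_partial_list(804, 19, 1668, 30295, 2627319, seed=0)
print(len(nums), min(nums), max(nums))
```

Every value drawn this way is at least the lowest value, so the positivity condition can only fail for the two numbers that are still missing.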


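That leaves two numbers, $x$ and $y$, which are found by solving two equations, where $a$ is the sum of the $802$ numbers already in the list and $b$ is the sum of their squared deviations from $\bar y$: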



$$\frac{x+y+a}{N}=\bar y,$$



$$\dfrac{(x-\bar y)^2+(y-\bar y)^2+ b}{N-1}=s^2.$$



This is done with SciPy's root function. In my case I had to add the line random.seed(210) to get exact results; which seed works may depend on the operating system and the computer. Without that line, the results are only close.
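The two equations can also be solved in closed form, which shows when an exact solution exists at all. Writing $t = N\bar y - a - 2\bar y$ and $r = s^2(N-1) - b$, the unknowns satisfy $(x-\bar y)+(y-\bar y)=t$ and $(x-\bar y)^2+(y-\bar y)^2=r$, so a real pair exists only if $2r - t^2 \ge 0$. When that quantity is negative, a numerical solver can only return an approximation. The sketch below is my own illustration of this; the name solve_last_two is hypothetical and not part of the original script.

```python
import math

def solve_last_two(nums, n, mean, sd):
    """Return the pair (x, y) that completes `nums` to n values with the
    given mean and sample standard deviation, or None if no real pair exists.
    (Hypothetical helper illustrating the closed-form solution.)"""
    a = sum(nums)                           # sum of the known values
    b = sum((v - mean) ** 2 for v in nums)  # their squared deviations from the target mean
    t = n * mean - a - 2 * mean             # required (x - mean) + (y - mean)
    r = sd ** 2 * (n - 1) - b               # required (x - mean)^2 + (y - mean)^2
    disc = 2 * r - t ** 2                   # equals ((x - mean) - (y - mean))^2
    if disc < 0:
        return None                         # the two equations have no real solution
    d = math.sqrt(disc)
    return mean + (t - d) / 2, mean + (t + d) / 2

# Small check: [1, 2, 3, 4, 10] has mean 4 and sample variance 12.5.
print(solve_last_two([1, 2, 3], 5, 4, math.sqrt(12.5)))
```

Since every number has to be greater than zero, the caller should still check that both returned values are positive; if the function returns None, a different random draw is needed.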
















                          answered Nov 22 '18 at 1:31









                          David






























