Urn problem: probability of m drawing balls for n simulation
$begingroup$
Problem Definition :
Inputs:
- j - the number of ball colours
- k - quantity of each ball colour
- i - total number of balls available (j*k)
- m - number of ball to randomly pick
- n - number of times to run the simulation
Outputs:
Numerical output of expected number of balls.
Looking for the equation where,
- m number of balls picked randomly from the bucket and determine the number of unique colours for single simulation
- Running simulation to determine the expected number of unique colours in after running this scenario n number of times (in other words, the average output of the simulation).
Found two similar kind of problems link1 and link2, however, it is difficult for me to understand a mathematical problem that matches with above definition of problem. So any help, hint or suggestion regarding same would be highly appreciated.
Thanks in advance.
probability sequences-and-series probability-theory polya-urn-model
$endgroup$
add a comment |
$begingroup$
Problem Definition :
Inputs:
- j - the number of ball colours
- k - quantity of each ball colour
- i - total number of balls available (j*k)
- m - number of ball to randomly pick
- n - number of times to run the simulation
Outputs:
Numerical output of expected number of balls.
Looking for the equation where,
- m number of balls picked randomly from the bucket and determine the number of unique colours for single simulation
- Running simulation to determine the expected number of unique colours in after running this scenario n number of times (in other words, the average output of the simulation).
Found two similar kind of problems link1 and link2, however, it is difficult for me to understand a mathematical problem that matches with above definition of problem. So any help, hint or suggestion regarding same would be highly appreciated.
Thanks in advance.
probability sequences-and-series probability-theory polya-urn-model
$endgroup$
$begingroup$
When picking m balls, are the balls picked kept out or put back after each pick during a simulation?
$endgroup$
– herb steinberg
Jan 19 at 20:54
$begingroup$
Hi thanks for the response, picked balls will be kept out.
$endgroup$
– Chitrang
Jan 19 at 21:12
$begingroup$
A minor stylistic note: I found it very confusing to have $j$ represent the number of colors and $i$ the total number of balls, because $i$ and $j$ are so much more commonly used as iteration variables, for example in $sum_{i = 1}^n,$ than as the total number of something. It's kind of like having someone graph a function $y = f(x)$ on a graph where the $y$ axis is horizontal and the $x$ axis is vertical. It's actually logically sound, but one has to keep reminding oneself not to assume the usual conventions.
$endgroup$
– David K
Jan 20 at 20:37
add a comment |
$begingroup$
Problem Definition :
Inputs:
- j - the number of ball colours
- k - quantity of each ball colour
- i - total number of balls available (j*k)
- m - number of ball to randomly pick
- n - number of times to run the simulation
Outputs:
Numerical output of expected number of balls.
Looking for the equation where,
- m number of balls picked randomly from the bucket and determine the number of unique colours for single simulation
- Running simulation to determine the expected number of unique colours in after running this scenario n number of times (in other words, the average output of the simulation).
Found two similar kind of problems link1 and link2, however, it is difficult for me to understand a mathematical problem that matches with above definition of problem. So any help, hint or suggestion regarding same would be highly appreciated.
Thanks in advance.
probability sequences-and-series probability-theory polya-urn-model
$endgroup$
Problem Definition :
Inputs:
- j - the number of ball colours
- k - quantity of each ball colour
- i - total number of balls available (j*k)
- m - number of ball to randomly pick
- n - number of times to run the simulation
Outputs:
Numerical output of expected number of balls.
Looking for the equation where,
- m number of balls picked randomly from the bucket and determine the number of unique colours for single simulation
- Running simulation to determine the expected number of unique colours in after running this scenario n number of times (in other words, the average output of the simulation).
Found two similar kind of problems link1 and link2, however, it is difficult for me to understand a mathematical problem that matches with above definition of problem. So any help, hint or suggestion regarding same would be highly appreciated.
Thanks in advance.
probability sequences-and-series probability-theory polya-urn-model
probability sequences-and-series probability-theory polya-urn-model
asked Jan 19 at 19:12


ChitrangChitrang
1011
1011
$begingroup$
When picking m balls, are the balls picked kept out or put back after each pick during a simulation?
$endgroup$
– herb steinberg
Jan 19 at 20:54
$begingroup$
Hi thanks for the response, picked balls will be kept out.
$endgroup$
– Chitrang
Jan 19 at 21:12
$begingroup$
A minor stylistic note: I found it very confusing to have $j$ represent the number of colors and $i$ the total number of balls, because $i$ and $j$ are so much more commonly used as iteration variables, for example in $sum_{i = 1}^n,$ than as the total number of something. It's kind of like having someone graph a function $y = f(x)$ on a graph where the $y$ axis is horizontal and the $x$ axis is vertical. It's actually logically sound, but one has to keep reminding oneself not to assume the usual conventions.
$endgroup$
– David K
Jan 20 at 20:37
add a comment |
$begingroup$
When picking m balls, are the balls picked kept out or put back after each pick during a simulation?
$endgroup$
– herb steinberg
Jan 19 at 20:54
$begingroup$
Hi thanks for the response, picked balls will be kept out.
$endgroup$
– Chitrang
Jan 19 at 21:12
$begingroup$
A minor stylistic note: I found it very confusing to have $j$ represent the number of colors and $i$ the total number of balls, because $i$ and $j$ are so much more commonly used as iteration variables, for example in $sum_{i = 1}^n,$ than as the total number of something. It's kind of like having someone graph a function $y = f(x)$ on a graph where the $y$ axis is horizontal and the $x$ axis is vertical. It's actually logically sound, but one has to keep reminding oneself not to assume the usual conventions.
$endgroup$
– David K
Jan 20 at 20:37
$begingroup$
When picking m balls, are the balls picked kept out or put back after each pick during a simulation?
$endgroup$
– herb steinberg
Jan 19 at 20:54
$begingroup$
When picking m balls, are the balls picked kept out or put back after each pick during a simulation?
$endgroup$
– herb steinberg
Jan 19 at 20:54
$begingroup$
Hi thanks for the response, picked balls will be kept out.
$endgroup$
– Chitrang
Jan 19 at 21:12
$begingroup$
Hi thanks for the response, picked balls will be kept out.
$endgroup$
– Chitrang
Jan 19 at 21:12
$begingroup$
A minor stylistic note: I found it very confusing to have $j$ represent the number of colors and $i$ the total number of balls, because $i$ and $j$ are so much more commonly used as iteration variables, for example in $sum_{i = 1}^n,$ than as the total number of something. It's kind of like having someone graph a function $y = f(x)$ on a graph where the $y$ axis is horizontal and the $x$ axis is vertical. It's actually logically sound, but one has to keep reminding oneself not to assume the usual conventions.
$endgroup$
– David K
Jan 20 at 20:37
$begingroup$
A minor stylistic note: I found it very confusing to have $j$ represent the number of colors and $i$ the total number of balls, because $i$ and $j$ are so much more commonly used as iteration variables, for example in $sum_{i = 1}^n,$ than as the total number of something. It's kind of like having someone graph a function $y = f(x)$ on a graph where the $y$ axis is horizontal and the $x$ axis is vertical. It's actually logically sound, but one has to keep reminding oneself not to assume the usual conventions.
$endgroup$
– David K
Jan 20 at 20:37
add a comment |
1 Answer
1
active
oldest
votes
$begingroup$
Of your two links, the second link (Expected number of different colors)
is the one that is more similar to yours.
The two differences are that it assumes drawing with replacement
(whereas you are drawing without replacement) and that it allows filling the bin with unequal numbers of balls of different colors.
We can still apply the basic techniques of that problem's solution:
indicator variables, and the linearity of expectation.
We assign a "color number" to each color that occurs in the urn.
For example, if the colors of balls in the urn are red, green, and black,
we might say red is color $1,$ green is color $2,$ and black is color $3.$
Then, for a single drawing, define $B_r$ as the number of balls drawn from the balls of color number $r$.
Next, we define the notation $[B_r > 0]$ so that $[B_r > 0] = 1$ if $B_r > 0$
and $[B_r > 0] = 0$ if $B_r = 0.$
The expression $[B_r > 0]$ is an indicator variable that says whether any balls of color $r$ were drawn.
Then
begin{align}
E([B_r > 0]) &= P([B_r > 0] = 1)cdot 1 + P([B_r > 0] = 0) cdot 0 \
&= P(B_r > 0) \
&= 1 - P(B_r = 0).
end{align}
To evaluate $P(B_r = 0),$ the probability that no balls of color $r$ are drawn,
we use the assumption that each $m$-ball subset of the $i$ balls in the urn is equally likely to be drawn as any other $m$-ball subset of the $i$ balls.
(You do assume that, right?)
There are $binom im$ subsets that contain $m$ balls, and of these,
$binom {i - k}m$ subsets that do not contain any balls of color $r.$
Therefore
begin{align}
P(B_r = 0) &= frac{binom {i - k}m}{binom im} \
&= frac{(i - k)!(i - m)!}{i!(i-m-k)!}\
&= frac{(i-m)(i - m - 1)cdots(i - m - k + 1)}{i(i-1)cdots(i - m + 1)}.
end{align}
That's three ways to write the formula for this probability; I like the first way best so I will continue writing it that way.
We therefore can write the expected value of $[B_r > 0]$,
$$
E([B_r > 0]) = 1 - frac{binom {i - k}m}{binom im},
$$
and since expectation is linear, we can add the expected values of
$[B_1 > 0],$ $[B_2 > 0],$ etc., up through $[B_j > 0]$ to get the
expected value of the sum
$[B_1 > 0] + [B_2 > 0] + [B_j > 0]$,
which literally counts the number of colors that were drawn:
begin{align}
Eleft(sum_{r=1}^j [B_r > 0]right)
&= sum_{r=1}^j E([B_r > 0]) \
&= sum_{r=1}^j left(1 - frac{binom {i - k}m}{binom im}right) \
&= j left(1 - frac{binom {i - k}m}{binom im}right).
end{align}
That's the expected number of colors in one drawing.
So that fact that you drew without replacement makes the calculation slightly more complicated
(if you consider ${binom {i - k}m}/{binom im}$ more complicated
than $left(1 - frac 1jright)^m$),
but the fact that you have the same number of balls of each color makes the problem simpler, because we get to multiply by $j$ instead of having to write the final answer in the $sum$ format.
Now when you make $n$ independent drawings, replacing all the balls after each drawing, again the linearity of expectation tells us that the expected average number of different colors per drawing is the same as the expected number of different colors in a single drawing.
The only thing that changes when we go to a large number of drawings is that
for a single drawing, it may be quite likely that you will see one or two colors more than expected or one or two colors fewer,
but with a large number of drawings, the average number of colors is not so likely to be much different from the expected value.
In other words, the expected value computed for a single drawing is the long-run average that you asked for.
$endgroup$
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
return StackExchange.using("mathjaxEditing", function () {
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["$", "$"], ["\\(","\\)"]]);
});
});
}, "mathjax-editing");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "69"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
noCode: true, onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmath.stackexchange.com%2fquestions%2f3079698%2furn-problem-probability-of-m-drawing-balls-for-n-simulation%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
$begingroup$
Of your two links, the second link (Expected number of different colors)
is the one that is more similar to yours.
The two differences are that it assumes drawing with replacement
(whereas you are drawing without replacement) and that it allows filling the bin with unequal numbers of balls of different colors.
We can still apply the basic techniques of that problem's solution:
indicator variables, and the linearity of expectation.
We assign a "color number" to each color that occurs in the urn.
For example, if the colors of balls in the urn are red, green, and black,
we might say red is color $1,$ green is color $2,$ and black is color $3.$
Then, for a single drawing, define $B_r$ as the number of balls drawn from the balls of color number $r$.
Next, we define the notation $[B_r > 0]$ so that $[B_r > 0] = 1$ if $B_r > 0$
and $[B_r > 0] = 0$ if $B_r = 0.$
The expression $[B_r > 0]$ is an indicator variable that says whether any balls of color $r$ were drawn.
Then
begin{align}
E([B_r > 0]) &= P([B_r > 0] = 1)cdot 1 + P([B_r > 0] = 0) cdot 0 \
&= P(B_r > 0) \
&= 1 - P(B_r = 0).
end{align}
To evaluate $P(B_r = 0),$ the probability that no balls of color $r$ are drawn,
we use the assumption that each $m$-ball subset of the $i$ balls in the urn is equally likely to be drawn as any other $m$-ball subset of the $i$ balls.
(You do assume that, right?)
There are $binom im$ subsets that contain $m$ balls, and of these,
$binom {i - k}m$ subsets that do not contain any balls of color $r.$
Therefore
begin{align}
P(B_r = 0) &= frac{binom {i - k}m}{binom im} \
&= frac{(i - k)!(i - m)!}{i!(i-m-k)!}\
&= frac{(i-m)(i - m - 1)cdots(i - m - k + 1)}{i(i-1)cdots(i - m + 1)}.
end{align}
That's three ways to write the formula for this probability; I like the first way best so I will continue writing it that way.
We therefore can write the expected value of $[B_r > 0]$,
$$
E([B_r > 0]) = 1 - frac{binom {i - k}m}{binom im},
$$
and since expectation is linear, we can add the expected values of
$[B_1 > 0],$ $[B_2 > 0],$ etc., up through $[B_j > 0]$ to get the
expected value of the sum
$[B_1 > 0] + [B_2 > 0] + [B_j > 0]$,
which literally counts the number of colors that were drawn:
begin{align}
Eleft(sum_{r=1}^j [B_r > 0]right)
&= sum_{r=1}^j E([B_r > 0]) \
&= sum_{r=1}^j left(1 - frac{binom {i - k}m}{binom im}right) \
&= j left(1 - frac{binom {i - k}m}{binom im}right).
end{align}
That's the expected number of colors in one drawing.
So that fact that you drew without replacement makes the calculation slightly more complicated
(if you consider ${binom {i - k}m}/{binom im}$ more complicated
than $left(1 - frac 1jright)^m$),
but the fact that you have the same number of balls of each color makes the problem simpler, because we get to multiply by $j$ instead of having to write the final answer in the $sum$ format.
Now when you make $n$ independent drawings, replacing all the balls after each drawing, again the linearity of expectation tells us that the expected average number of different colors per drawing is the same as the expected number of different colors in a single drawing.
The only thing that changes when we go to a large number of drawings is that
for a single drawing, it may be quite likely that you will see one or two colors more than expected or one or two colors fewer,
but with a large number of drawings, the average number of colors is not so likely to be much different from the expected value.
In other words, the expected value computed for a single drawing is the long-run average that you asked for.
$endgroup$
add a comment |
$begingroup$
Of your two links, the second link (Expected number of different colors)
is the one that is more similar to yours.
The two differences are that it assumes drawing with replacement
(whereas you are drawing without replacement) and that it allows filling the bin with unequal numbers of balls of different colors.
We can still apply the basic techniques of that problem's solution:
indicator variables, and the linearity of expectation.
We assign a "color number" to each color that occurs in the urn.
For example, if the colors of balls in the urn are red, green, and black,
we might say red is color $1,$ green is color $2,$ and black is color $3.$
Then, for a single drawing, define $B_r$ as the number of balls drawn from the balls of color number $r$.
Next, we define the notation $[B_r > 0]$ so that $[B_r > 0] = 1$ if $B_r > 0$
and $[B_r > 0] = 0$ if $B_r = 0.$
The expression $[B_r > 0]$ is an indicator variable that says whether any balls of color $r$ were drawn.
Then
begin{align}
E([B_r > 0]) &= P([B_r > 0] = 1)cdot 1 + P([B_r > 0] = 0) cdot 0 \
&= P(B_r > 0) \
&= 1 - P(B_r = 0).
end{align}
To evaluate $P(B_r = 0),$ the probability that no balls of color $r$ are drawn,
we use the assumption that each $m$-ball subset of the $i$ balls in the urn is equally likely to be drawn as any other $m$-ball subset of the $i$ balls.
(You do assume that, right?)
There are $binom im$ subsets that contain $m$ balls, and of these,
$binom {i - k}m$ subsets that do not contain any balls of color $r.$
Therefore
begin{align}
P(B_r = 0) &= frac{binom {i - k}m}{binom im} \
&= frac{(i - k)!(i - m)!}{i!(i-m-k)!}\
&= frac{(i-m)(i - m - 1)cdots(i - m - k + 1)}{i(i-1)cdots(i - m + 1)}.
end{align}
That's three ways to write the formula for this probability; I like the first way best so I will continue writing it that way.
We therefore can write the expected value of $[B_r > 0]$,
$$
E([B_r > 0]) = 1 - frac{binom {i - k}m}{binom im},
$$
and since expectation is linear, we can add the expected values of
$[B_1 > 0],$ $[B_2 > 0],$ etc., up through $[B_j > 0]$ to get the
expected value of the sum
$[B_1 > 0] + [B_2 > 0] + [B_j > 0]$,
which literally counts the number of colors that were drawn:
begin{align}
Eleft(sum_{r=1}^j [B_r > 0]right)
&= sum_{r=1}^j E([B_r > 0]) \
&= sum_{r=1}^j left(1 - frac{binom {i - k}m}{binom im}right) \
&= j left(1 - frac{binom {i - k}m}{binom im}right).
end{align}
That's the expected number of colors in one drawing.
So that fact that you drew without replacement makes the calculation slightly more complicated
(if you consider ${binom {i - k}m}/{binom im}$ more complicated
than $left(1 - frac 1jright)^m$),
but the fact that you have the same number of balls of each color makes the problem simpler, because we get to multiply by $j$ instead of having to write the final answer in the $sum$ format.
Now when you make $n$ independent drawings, replacing all the balls after each drawing, again the linearity of expectation tells us that the expected average number of different colors per drawing is the same as the expected number of different colors in a single drawing.
The only thing that changes when we go to a large number of drawings is that
for a single drawing, it may be quite likely that you will see one or two colors more than expected or one or two colors fewer,
but with a large number of drawings, the average number of colors is not so likely to be much different from the expected value.
In other words, the expected value computed for a single drawing is the long-run average that you asked for.
$endgroup$
add a comment |
$begingroup$
Of your two links, the second link (Expected number of different colors)
is the one that is more similar to yours.
The two differences are that it assumes drawing with replacement
(whereas you are drawing without replacement) and that it allows filling the bin with unequal numbers of balls of different colors.
We can still apply the basic techniques of that problem's solution:
indicator variables, and the linearity of expectation.
We assign a "color number" to each color that occurs in the urn.
For example, if the colors of balls in the urn are red, green, and black,
we might say red is color $1,$ green is color $2,$ and black is color $3.$
Then, for a single drawing, define $B_r$ as the number of balls drawn from the balls of color number $r$.
Next, we define the notation $[B_r > 0]$ so that $[B_r > 0] = 1$ if $B_r > 0$
and $[B_r > 0] = 0$ if $B_r = 0.$
The expression $[B_r > 0]$ is an indicator variable that says whether any balls of color $r$ were drawn.
Then
begin{align}
E([B_r > 0]) &= P([B_r > 0] = 1)cdot 1 + P([B_r > 0] = 0) cdot 0 \
&= P(B_r > 0) \
&= 1 - P(B_r = 0).
end{align}
To evaluate $P(B_r = 0),$ the probability that no balls of color $r$ are drawn,
we use the assumption that each $m$-ball subset of the $i$ balls in the urn is equally likely to be drawn as any other $m$-ball subset of the $i$ balls.
(You do assume that, right?)
There are $binom im$ subsets that contain $m$ balls, and of these,
$binom {i - k}m$ subsets that do not contain any balls of color $r.$
Therefore
begin{align}
P(B_r = 0) &= frac{binom {i - k}m}{binom im} \
&= frac{(i - k)!(i - m)!}{i!(i-m-k)!}\
&= frac{(i-m)(i - m - 1)cdots(i - m - k + 1)}{i(i-1)cdots(i - m + 1)}.
end{align}
That's three ways to write the formula for this probability; I like the first way best so I will continue writing it that way.
We therefore can write the expected value of $[B_r > 0]$,
$$
E([B_r > 0]) = 1 - frac{binom {i - k}m}{binom im},
$$
and since expectation is linear, we can add the expected values of
$[B_1 > 0],$ $[B_2 > 0],$ etc., up through $[B_j > 0]$ to get the
expected value of the sum
$[B_1 > 0] + [B_2 > 0] + [B_j > 0]$,
which literally counts the number of colors that were drawn:
begin{align}
Eleft(sum_{r=1}^j [B_r > 0]right)
&= sum_{r=1}^j E([B_r > 0]) \
&= sum_{r=1}^j left(1 - frac{binom {i - k}m}{binom im}right) \
&= j left(1 - frac{binom {i - k}m}{binom im}right).
end{align}
That's the expected number of colors in one drawing.
So that fact that you drew without replacement makes the calculation slightly more complicated
(if you consider ${binom {i - k}m}/{binom im}$ more complicated
than $left(1 - frac 1jright)^m$),
but the fact that you have the same number of balls of each color makes the problem simpler, because we get to multiply by $j$ instead of having to write the final answer in the $sum$ format.
Now when you make $n$ independent drawings, replacing all the balls after each drawing, again the linearity of expectation tells us that the expected average number of different colors per drawing is the same as the expected number of different colors in a single drawing.
The only thing that changes when we go to a large number of drawings is that
for a single drawing, it may be quite likely that you will see one or two colors more than expected or one or two colors fewer,
but with a large number of drawings, the average number of colors is not so likely to be much different from the expected value.
In other words, the expected value computed for a single drawing is the long-run average that you asked for.
$endgroup$
Of your two links, the second link (Expected number of different colors)
is the one that is more similar to yours.
The two differences are that it assumes drawing with replacement
(whereas you are drawing without replacement) and that it allows filling the bin with unequal numbers of balls of different colors.
We can still apply the basic techniques of that problem's solution:
indicator variables, and the linearity of expectation.
We assign a "color number" to each color that occurs in the urn.
For example, if the colors of balls in the urn are red, green, and black,
we might say red is color $1,$ green is color $2,$ and black is color $3.$
Then, for a single drawing, define $B_r$ as the number of balls drawn from the balls of color number $r$.
Next, we define the notation $[B_r > 0]$ so that $[B_r > 0] = 1$ if $B_r > 0$
and $[B_r > 0] = 0$ if $B_r = 0.$
The expression $[B_r > 0]$ is an indicator variable that says whether any balls of color $r$ were drawn.
Then
begin{align}
E([B_r > 0]) &= P([B_r > 0] = 1)cdot 1 + P([B_r > 0] = 0) cdot 0 \
&= P(B_r > 0) \
&= 1 - P(B_r = 0).
end{align}
To evaluate $P(B_r = 0),$ the probability that no balls of color $r$ are drawn,
we use the assumption that each $m$-ball subset of the $i$ balls in the urn is equally likely to be drawn as any other $m$-ball subset of the $i$ balls.
(You do assume that, right?)
There are $binom im$ subsets that contain $m$ balls, and of these,
$binom {i - k}m$ subsets that do not contain any balls of color $r.$
Therefore
begin{align}
P(B_r = 0) &= frac{binom {i - k}m}{binom im} \
&= frac{(i - k)!(i - m)!}{i!(i-m-k)!}\
&= frac{(i-m)(i - m - 1)cdots(i - m - k + 1)}{i(i-1)cdots(i - m + 1)}.
end{align}
That's three ways to write the formula for this probability; I like the first way best so I will continue writing it that way.
We therefore can write the expected value of $[B_r > 0]$,
$$
E([B_r > 0]) = 1 - frac{binom {i - k}m}{binom im},
$$
and since expectation is linear, we can add the expected values of
$[B_1 > 0],$ $[B_2 > 0],$ etc., up through $[B_j > 0]$ to get the
expected value of the sum
$[B_1 > 0] + [B_2 > 0] + [B_j > 0]$,
which literally counts the number of colors that were drawn:
begin{align}
Eleft(sum_{r=1}^j [B_r > 0]right)
&= sum_{r=1}^j E([B_r > 0]) \
&= sum_{r=1}^j left(1 - frac{binom {i - k}m}{binom im}right) \
&= j left(1 - frac{binom {i - k}m}{binom im}right).
end{align}
That's the expected number of colors in one drawing.
So that fact that you drew without replacement makes the calculation slightly more complicated
(if you consider ${binom {i - k}m}/{binom im}$ more complicated
than $left(1 - frac 1jright)^m$),
but the fact that you have the same number of balls of each color makes the problem simpler, because we get to multiply by $j$ instead of having to write the final answer in the $sum$ format.
Now when you make $n$ independent drawings, replacing all the balls after each drawing, again the linearity of expectation tells us that the expected average number of different colors per drawing is the same as the expected number of different colors in a single drawing.
The only thing that changes when we go to a large number of drawings is that
for a single drawing, it may be quite likely that you will see one or two colors more than expected or one or two colors fewer,
but with a large number of drawings, the average number of colors is not so likely to be much different from the expected value.
In other words, the expected value computed for a single drawing is the long-run average that you asked for.
edited Jan 20 at 20:29
answered Jan 20 at 19:41
David KDavid K
54.8k343120
54.8k343120
add a comment |
add a comment |
Thanks for contributing an answer to Mathematics Stack Exchange!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
Use MathJax to format equations. MathJax reference.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmath.stackexchange.com%2fquestions%2f3079698%2furn-problem-probability-of-m-drawing-balls-for-n-simulation%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
$begingroup$
When picking m balls, are the balls picked kept out or put back after each pick during a simulation?
$endgroup$
– herb steinberg
Jan 19 at 20:54
$begingroup$
Hi thanks for the response, picked balls will be kept out.
$endgroup$
– Chitrang
Jan 19 at 21:12
$begingroup$
A minor stylistic note: I found it very confusing to have $j$ represent the number of colors and $i$ the total number of balls, because $i$ and $j$ are so much more commonly used as iteration variables, for example in $sum_{i = 1}^n,$ than as the total number of something. It's kind of like having someone graph a function $y = f(x)$ on a graph where the $y$ axis is horizontal and the $x$ axis is vertical. It's actually logically sound, but one has to keep reminding oneself not to assume the usual conventions.
$endgroup$
– David K
Jan 20 at 20:37