Sampling distribution of sample trimmed (truncated) mean
$begingroup$
It is elementary probability theory that the sample mean of an i.i.d. sample follows a normal distribution if the population distribution is normal. But what about the trimmed mean? Is there any result on its distribution for an i.i.d. sample of size $n$ (for a normal or a general population distribution)?
My only idea is to use the results for the distribution of order statistics (summing them while taking their dependence into account), but that seems exceedingly complicated; perhaps there is an easier way.
probability statistics probability-distributions statistical-inference order-statistics
$endgroup$
$begingroup$
Tamás! What is the trimmed mean?
$endgroup$
– zoli
Jun 3 '15 at 12:41
$begingroup$
@zoli: You take the sample, drop some of the smallest and largest observations (say, the 10 smallest and 10 largest in a sample of 100, which gives the 10% trimmed mean) and take the mean of the remaining observations (80 in the example). It is sometimes also called the truncated mean.
$endgroup$
– Tamas Ferenci
Jun 3 '15 at 12:53
$begingroup$
@zoli: See as much of my Answer as interests you.
$endgroup$
– BruceET
Jun 5 '15 at 3:20
asked Jun 3 '15 at 11:36 by Tamas Ferenci (edited Jun 4 '15 at 20:18)
1 Answer
$begingroup$
You are correct that the distribution theory is of an advanced nature.
An important paper on this topic is Stephen M. Stigler, Annals of Statistics, Vol. 1, No. 3 (1973); an open version is here. This other Stigler paper is also relevant.
However, in general terms, if the population distribution is normal (or any other continuous unimodal distribution that has a mean and variance and whose density decreases monotonically towards its tail(s)), the distribution of the trimmed mean
converges to normal as the sample size increases. (The condition I
have given can be weakened, but it covers a vast majority of
distributions used in practical modeling.)
Various versions of the trimmed mean eliminate different percentages
of their observations from both tails. A common choice is the 5% trimmed mean that cuts 5% from each tail and averages the 'middle'
90% of the data. As trimming approaches 50% from each tail, the trimmed mean becomes the median.
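The definition can be sketched in a few lines of Python. This is a minimal illustration, not code from the answer: the function name `trimmed_mean`, the convention of dropping `floor(proportion * n)` observations from each tail, and the toy `sample` are all assumptions made here.

```python
import math
from statistics import mean, median


def trimmed_mean(data, proportion):
    """Mean after dropping floor(proportion * n) observations from each tail.

    The floor convention is an assumption; other software may round differently.
    """
    if not 0 <= proportion < 0.5:
        raise ValueError("proportion must be in [0, 0.5)")
    xs = sorted(data)
    # Small tolerance guards against products such as 0.29 * 100 landing just below 29.
    k = math.floor(proportion * len(xs) + 1e-9)
    return mean(xs[k:len(xs) - k])


if __name__ == "__main__":
    sample = [2, 4, 5, 7, 8, 9, 11, 12, 14, 250]  # made-up data with one gross outlier
    print(mean(sample))                 # ordinary mean, pulled up by 250
    print(trimmed_mean(sample, 0.10))   # drops 2 and 250, averages the middle 8
    print(trimmed_mean(sample, 0.40))   # only the two middle values remain
    print(median(sample))               # same two middle values, averaged
```

With 40% cut from each tail of ten observations only the two central order statistics survive, so the trimmed mean coincides with the median in this toy case.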
The degree of trimming may affect the rate at which normality
is reached, but the tendency is nevertheless towards normal.
There is even a 'central limit theorem' for medians.
When there is symmetry all around (a symmetric population
distribution and the same percentage trimmed from each tail),
the expectation of the trimmed mean is the same as the population mean. The variance depends on the percentage of trimming, the shape of
the population distribution, and the sample size.
Because of the messiness of the distribution theory, it is common in practice to do simulation studies to determine
the distribution of the trimmed mean in a particular situation.
For example, suppose the parent distribution is a mixture
of 90% $\mathsf{Norm}(100, \sigma = 10)$ and 10% $\mathsf{Norm}(130, \sigma = 50),$ and
we have a sample of size $n = 20.$
The usual terminology is that the population with mean 100 has been
"10% contaminated" by observations with a larger mean and standard deviation. Ten percent is a fairly high level of contamination.
The contaminated distribution is far from normal, with very
heavy tails, and right skewness.
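As a rough check on the scale of the simulated values, the first two moments of this mixture can be worked out by hand. This is a sketch using the standard mixture-variance formula; the symbols $\mu$, $\sigma^2$ and $X_i$ are notation introduced here, not the answer's.

$$\mu = 0.9(100) + 0.1(130) = 103,$$

$$\sigma^2 = 0.9(10^2) + 0.1(50^2) + 0.9(0.1)(130 - 100)^2 = 90 + 250 + 81 = 421,$$

$$SD(X_i) = \sqrt{421} \approx 20.5, \qquad SD(\bar X) = \sqrt{421/20} \approx 4.6 \quad (n = 20).$$

So a single observation has a standard deviation of about 20.5, while the mean of 20 independent observations has a standard deviation of about 4.6. A simulated value near 18.7 is therefore on the scale of the within-sample spread rather than of $\bar X$ itself; the answer does not say which of the two its reported standard deviations measure.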
A simple simulation with 100,000 samples of size 20 shows that $E(\bar X) = 103$ and $SD(\bar X) = 18.7$ for the original data.
For the trimmed data (denoted by $Y$) we have
$E(\bar Y) = 101.6$ and $SD(\bar Y) = 11.3.$
Histograms of both $\bar X$ and $\bar Y$ are "nearly" normal,
even with the relatively small sample size $n = 20$, but both
are slightly skewed to the right.
Trimming tends to put $\bar Y$
closer to the mean 100 of the 'main' population than is true for
the untrimmed mean. Similarly, trimming has eliminated some,
but not all of the 'excess' standard deviation due to
contamination. We see that 5% trimming has partly mitigated the
effects of serious contamination, but hardly completely.
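A simulation of this kind might look like the sketch below. It is not the code behind the numbers quoted above (the answer does not show any); it uses only the Python standard library, and the names `draw_contaminated` and `simulate`, the seed, and the choice to trim exactly one observation from each tail (5% of $n = 20$) are assumptions made for illustration.

```python
import random
from statistics import mean, stdev


def draw_contaminated(rng, n, p_contam=0.10):
    """Draw n values: Norm(130, sd 50) with probability p_contam, else Norm(100, sd 10)."""
    return [
        rng.gauss(130, 50) if rng.random() < p_contam else rng.gauss(100, 10)
        for _ in range(n)
    ]


def simulate(reps=100_000, n=20, k=1, seed=2015):
    """Return (untrimmed means, trimmed means) over `reps` samples of size n.

    k observations are dropped from each tail; k=1 with n=20 is 5% trimming.
    """
    rng = random.Random(seed)  # arbitrary fixed seed so runs are repeatable
    plain, trimmed = [], []
    for _ in range(reps):
        xs = sorted(draw_contaminated(rng, n))
        plain.append(sum(xs) / n)
        kept = xs[k:n - k]  # the middle n - 2k order statistics
        trimmed.append(sum(kept) / len(kept))
    return plain, trimmed


if __name__ == "__main__":
    # 20,000 replications keep the demo quick; use reps=100_000 to match the text.
    plain, trimmed = simulate(reps=20_000)
    print(f"untrimmed: mean {mean(plain):.2f}, SD {stdev(plain):.2f}")
    print(f"trimmed:   mean {mean(trimmed):.2f}, SD {stdev(trimmed):.2f}")
```

The two returned lists are the simulated sampling distributions of the untrimmed and trimmed means, so they can be passed straight to a histogram routine to inspect skewness.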
$endgroup$
$begingroup$
Thanks for your exhaustive and very informative answer! I have only one, but unfortunately profound, problem: I'm interested not in the asymptotic but in the finite-sample distribution, say, the sampling distribution when $n=10$. The simulation approach is feasible (I tried it myself), but I hoped that there are analytical results on this topic that I'm just not aware of.
$endgroup$
– Tamas Ferenci
Jun 5 '15 at 13:11
$begingroup$
I am not aware of results or methods for such analytic solutions. In practice, the trimmed mean is most appropriately used when an unusual or contaminated population distribution is suspected, so the possibilities are quite ill-defined and broad. Also, simple and appealing as trimming might seem, I believe there are much better, certainly more popular, 'robust', nonparametric, and computer-intensive methods available nowadays. As evidence, several statistical computing packages have taken trimmed means off their default menus.
$endgroup$
– BruceET
Jun 5 '15 at 17:27
$begingroup$
Continuation: Trimming does not make much sense when $n=10$ because the most minimal trimming would leave you with only eight observations. I chose 20 for my example because it is really the minimum sensible case. For short-tailed distributions (as result from trimming), convergence to near normal is rapid. For example, the mean of 10 observations from a uniform distribution is almost undetectably different from normal. Asymmetrical cases converge more slowly.
$endgroup$
– BruceET
Jun 5 '15 at 17:39
$begingroup$
You're absolutely right. It is my fault: I wanted to make the question as succinct as possible, so I omitted the context. Let me add it now.
$endgroup$
– Tamas Ferenci
Jun 5 '15 at 22:45
$begingroup$
I aimed to come up with a very, very simple model for publication bias. You take a background distribution of $N(0,\sigma^2)$ (the drug is not effective). You take a sample of size $n$ (trials of equal size are performed); its average (the meta-analysis) is also normal with zero mean. But now comes the important part: you drop the minimum observation (the worst trial is not performed). What will be the distribution of your meta-analysis in this case? This is practically a trimmed mean, but with asymmetric trimming, which is why I asked.
$endgroup$
– Tamas Ferenci
Jun 5 '15 at 22:45
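The drop-the-minimum model in the last comment is easy to explore by simulation. The sketch below is only that, a simulation, not the finite-sample analytic result asked for; the names `mean_without_minimum` and `simulate_drop_min`, the seed, and the replication count are assumptions made here. One fact does follow directly: since the full sample sum has expectation zero, the mean of the remaining $n-1$ values has expectation $-E[X_{(1)}]/(n-1) > 0$, which for $n = 2$ is $\sigma/\sqrt{\pi}$ (the expected maximum of two normal draws).

```python
import math
import random
from statistics import mean, stdev


def mean_without_minimum(rng, n, sigma=1.0):
    """Average of n draws from N(0, sigma^2) after discarding the smallest one."""
    xs = [rng.gauss(0.0, sigma) for _ in range(n)]
    return (sum(xs) - min(xs)) / (n - 1)


def simulate_drop_min(n, reps=50_000, sigma=1.0, seed=1):
    """Simulated sampling distribution of the 'meta-analysis' with the worst trial dropped."""
    rng = random.Random(seed)  # arbitrary fixed seed so runs are repeatable
    return [mean_without_minimum(rng, n, sigma) for _ in range(reps)]


if __name__ == "__main__":
    for n in (2, 5, 10):
        vals = simulate_drop_min(n, reps=20_000)
        print(f"n={n:2d}: mean {mean(vals):.3f}, SD {stdev(vals):.3f}")
    # Reference value for n=2 with sigma=1: the expected maximum of two N(0,1) draws.
    print(f"sigma/sqrt(pi) = {1 / math.sqrt(math.pi):.3f}")
```

The positive simulated mean is the bias that dropping the worst trial introduces under a true effect of zero.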
answered Jun 5 '15 at 3:09 by BruceET (edited Jan 16 at 20:42 by kjetil b halvorsen)
$begingroup$
Thanks for your exhaustive and very informative answer! I have only one - but unfortunately profound - problem: I'm interested not in the asymptotic, but in the finite sample distribution. Say, the sampling distribution when $n=10$. The simulation approach is feasible, I tried it myself, but I hoped that there are analytical results in this topic (just I'm not aware of them).
$endgroup$
– Tamas Ferenci
Jun 5 '15 at 13:11
$begingroup$
I am not aware of results or methods for such analytic solutions. In practice, the trimmed mean is most appropriately used when an unusual or contaminated population distribution is suspected, so the possibilities are quite ill-defined and broad. Also, simple and appealing as the trimming might seem, I believe there are much better, certainly more popular, 'robust', nonparametric, and computer-intensive methods available nowadays. As evidence, several stat comp pkgs have taken trimmed means off their default menus.
$endgroup$
– BruceET
Jun 5 '15 at 17:27
$begingroup$
Continuation: Trimming does not make much sense when $n=10$ because the most minimal trimming would leave you with only eight obs. I chose 20 for my exmp because it is really the min sensible case. For short-tailed distributions (as result from trimming), convergence to near normal is rapid. For example, the mean of 10 obs from a uniform dist'n is almost undetectably different from normal. Asymmetrical cases converge more slowly
$endgroup$
– BruceET
Jun 5 '15 at 17:39
$begingroup$
You're absolutely right. It is my fault: I wanted to make the question as succint as possible, so I omitted the context. Let me add it now.
$endgroup$
– Tamas Ferenci
Jun 5 '15 at 22:45
$begingroup$
I aimed to come up with a very-very-very simple model for publication bias. You take a background distribution of $N(0,sigma^2)$ (the drug is not effective). You take a sample of size $n$ (trials of equal size performed), its average (metaanalysis) is also normal with zero mean. But - now comes the important part - you drop the minimum observation: the worst trial is not performed. What will be the distribution of your metaanalysis in this case? And this is practically a trimmed mean, that's why I asked it, but with asymmetric trimming.
$endgroup$
– Tamas Ferenci
Jun 5 '15 at 22:45
|
show 3 more comments
$begingroup$
Thanks for your exhaustive and very informative answer! I have only one - but unfortunately profound - problem: I'm interested not in the asymptotic, but in the finite sample distribution. Say, the sampling distribution when $n=10$. The simulation approach is feasible, I tried it myself, but I hoped that there are analytical results in this topic (just I'm not aware of them).
$endgroup$
– Tamas Ferenci
Jun 5 '15 at 13:11
$begingroup$
I am not aware of results or methods for such analytic solutions. In practice, the trimmed mean is most appropriately used when an unusual or contaminated population distribution is suspected, so the possibilities are quite ill-defined and broad. Also, simple and appealing as the trimming might seem, I believe there are much better, certainly more popular, 'robust', nonparametric, and computer-intensive methods available nowadays. As evidence, several stat comp pkgs have taken trimmed means off their default menus.
$endgroup$
– BruceET
Jun 5 '15 at 17:27
$begingroup$
Continuation: Trimming does not make much sense when $n=10$ because the most minimal trimming would leave you with only eight obs. I chose 20 for my exmp because it is really the min sensible case. For short-tailed distributions (as result from trimming), convergence to near normal is rapid. For example, the mean of 10 obs from a uniform dist'n is almost undetectably different from normal. Asymmetrical cases converge more slowly
$endgroup$
– BruceET
Jun 5 '15 at 17:39
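The remark about the mean of 10 uniform observations can be checked with a quick simulation. The following is only a rough sketch, not part of the original comment: it assumes Uniform(0,1) observations, and the function name, replication count, and seed are illustrative choices. It measures how often the sample mean falls inside the interval that would hold 95% of the probability if the mean were exactly normal.

```python
import random
import statistics


def uniform_mean_coverage(n=10, reps=50_000, seed=2):
    """Fraction of simulated means of n Uniform(0,1) draws that fall
    inside the central 95% interval of the matching normal distribution."""
    rng = random.Random(seed)
    # Exact mean and standard deviation of the mean of n Uniform(0,1) values.
    mu = 0.5
    sd = (1.0 / 12.0 / n) ** 0.5
    # Two-sided 95% normal quantile (about 1.96).
    z = statistics.NormalDist().inv_cdf(0.975)
    inside = 0
    for _ in range(reps):
        xbar = sum(rng.random() for _ in range(n)) / n
        if abs(xbar - mu) <= z * sd:
            inside += 1
    return inside / reps


coverage = uniform_mean_coverage()
print(f"fraction within normal 95% limits: {coverage:.4f}")
```

A fraction close to 0.95 is consistent with the claim of near-normality at $n=10$; a single coverage figure is of course a much weaker check than comparing the whole distribution.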
$begingroup$
You're absolutely right. It is my fault: I wanted to make the question as succinct as possible, so I omitted the context. Let me add it now.
$endgroup$
– Tamas Ferenci
Jun 5 '15 at 22:45
$begingroup$
I was aiming for a very, very simple model of publication bias. Take a background distribution of $N(0,\sigma^2)$ (the drug is not effective). Take a sample of size $n$ (trials of equal size are performed); its average (the meta-analysis) is also normal with zero mean. But - now comes the important part - you drop the minimum observation: the worst trial is not performed. What is the distribution of the meta-analysis in this case? This is practically a trimmed mean, but with asymmetric trimming, which is why I asked.
$endgroup$
– Tamas Ferenci
Jun 5 '15 at 22:45
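The drop-the-minimum model described above is easy to simulate for a finite $n$. The sketch below is an illustration under stated assumptions, not an analytical result and not the code used by anyone in this thread: it takes $n=10$ and $\sigma=1$, and the function names, replication count, and seed are hypothetical. It also uses the identity that the mean after dropping the minimum equals $(n\bar{X} - X_{(1)})/(n-1)$, so with $E[\bar{X}]=0$ its expectation is $-E[X_{(1)}]/(n-1)$, which is positive.

```python
import random
import statistics


def drop_min_mean(sample):
    """Mean of the sample after discarding its single smallest value."""
    return (sum(sample) - min(sample)) / (len(sample) - 1)


def simulate_drop_min(n=10, sigma=1.0, reps=20_000, seed=1):
    """Simulate the drop-the-minimum mean for N(0, sigma^2) samples.

    Returns the simulated statistics and the simulated sample minima.
    """
    rng = random.Random(seed)
    means = []
    minima = []
    for _ in range(reps):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        means.append(drop_min_mean(sample))
        minima.append(min(sample))
    return means, minima


n = 10
means, minima = simulate_drop_min(n=n)

# Simulated expectation of the statistic (the bias, since the true effect is 0).
bias = statistics.fmean(means)
# Value implied by the identity: -E[min] / (n - 1), with E[min] estimated
# from the same simulated samples.
predicted = -statistics.fmean(minima) / (n - 1)

print(f"simulated mean of statistic: {bias:.4f}")
print(f"-E[min]/(n-1) from minima:   {predicted:.4f}")
print(f"simulated sd of statistic:   {statistics.stdev(means):.4f}")
```

The two printed means should agree up to simulation noise, and a histogram of `means` would show the finite-sample shape directly. This gives a numerical picture only; it does not replace the analytical distribution asked about.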