Using Stats to provide a score for a daily rating with low sample size
Context
We have a register of people in our database, with a record of teams and leaders. Every day we send out a survey to each team asking them to rate their leader on whether they performed a specific duty. So for example we might have a team leader who has 5 team members reporting into that leader. Each day those 5 members receive a survey asking them if the leader performed a duty.
The reports may choose whether or not to respond to the survey, and so for any one day you may receive responses from all 5 reports, or from only 1 report, or from 0 reports.
The result from each report gives you a binary decision, 1 or 0, of whether the leader performed the duty on that day.
Moving beyond this example, each leader will have a different team size.
In addition, in each survey we can actually ask the team member whether the leader performed more than one duty. So we could ask the report about 1 duty, or 2 duties or 3 duties, or even more. Again, for each duty we have a true or false, 1 or 0, decision regarding whether the leader performed that duty, with a set of decisions for each report who answered the survey.
Problem
We want to be able to give a rating to a leader on how they performed for any one survey. Each leader has a team size, and for each survey only a certain proportion of that team size will actually respond. The response rate could be anywhere from 0% to 100%.
Ignoring the case of a 0% participation rate, how do we give the leader a score for their performance for the day?
We would ideally like to use only the variables that are available for that day – so not using any historical data from past surveys, but instead only the current survey’s results.
We would ideally like to factor the participation rate into the score as well, so that a day with high confidence (full participation) is not at a disadvantage compared with a day with low confidence (low participation).
The score itself could take the form of a 0 to 1 or 0 to 100 scale, with a threshold applied to transform the score into a specific category e.g. “Good”, “Neutral”, “Bad”, or it could go directly to a categorical score.
In addition, the scores that people receive should be perceived as fair, as they will have an effect on motivation. We want each score to be independent of other team leaders' scores – meaning that we do not want to force the scores into some pre-specified distribution, like grading on a curve (https://en.wikipedia.org/wiki/Grading_on_a_curve).
Options I've considered
There are some really obvious ones like:
- Number of Votes / Team Size (for each duty, then averaged across all duties), with a threshold applied to the percentage - [possibly our best option right now, but participation is usually toward the lower end, so it would be hard to score well without raising participation]
- Number of Votes / Number of Survey Participants (for each duty, then averaged across all duties), with a threshold applied to the percentage - [this makes it easier to get 100 if fewer people participate in the survey]
- Number of Votes by itself, with a threshold on the total count - [this disadvantages small teams, as they can get fewer possible votes]
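To make the three options concrete, here is a minimal sketch of how each would be computed for one leader on one day. The function name `option_scores` and the input layout (one list of 0/1 answers per respondent, one entry per duty) are just illustrative assumptions, and I'm assuming a "vote" means a 1 answer.

```python
from statistics import mean

def option_scores(team_size, responses):
    """Compute the three candidate scores for one leader on one day.

    responses: one list per respondent, each holding a 0/1 answer per duty
    (hypothetical layout, assumed for illustration only).
    """
    n_resp = len(responses)
    if n_resp == 0:
        raise ValueError("no responses: the 0% participation case is out of scope")
    n_duties = len(responses[0])
    # Number of 1 answers ("votes") for each duty
    yes_per_duty = [sum(r[d] for r in responses) for d in range(n_duties)]
    return {
        # Option 1: votes / team size, averaged across duties
        "votes_over_team_size": mean(y / team_size for y in yes_per_duty),
        # Option 2: votes / respondents, averaged across duties
        "votes_over_participants": mean(y / n_resp for y in yes_per_duty),
        # Option 3: raw vote count
        "total_votes": sum(yes_per_duty),
    }

# Team of 5, two respondents, two duties
scores = option_scores(team_size=5, responses=[[1, 1], [1, 0]])
print(scores)
```

With a team of 5 and only two respondents, option 1 is capped at 0.4 per duty no matter how the leader performed, while option 2 can reach 1.0 from a single positive response.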
None of these takes all three variables (votes, participants and team size) into account at the same time. I have considered whether confidence intervals would be appropriate, but our team sizes vary from 1 person to about 30, so I think the population size is too small to use them (correct me if I'm wrong). For that reason I believe Evan Miller's "How Not To Sort By Average Rating" doesn't apply well in my case. In addition, these aren't ratings, but rather occurrences of particular events, which are independent of each other.
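For reference, the approach in Evan Miller's article is the lower bound of the Wilson score interval. A minimal sketch of it for a single duty is below; the function name and the default `z = 1.96` (roughly 95% confidence) are my own choices. Note that it only uses the number of 1 answers and the number of respondents, not the team size, and it treats the respondents as independent samples, which is exactly the assumption I'm unsure about with teams this small.

```python
from math import sqrt

def wilson_lower_bound(yes, n, z=1.96):
    """Lower bound of the Wilson score interval for a binomial proportion.

    yes: number of 1 answers, n: number of respondents (n > 0),
    z: normal quantile (1.96 is roughly a 95% interval).
    """
    if n == 0:
        raise ValueError("no responses: the 0% participation case is out of scope")
    p = yes / n
    denom = 1 + z * z / n
    centre = p + z * z / (2 * n)
    margin = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return (centre - margin) / denom

# The same 100% "yes" rate scores higher as more people respond
for n in (1, 5, 30):
    print(n, round(wilson_lower_bound(n, n), 3))
```

A unanimous result from one respondent gets a much lower bound than a unanimous result from 30 respondents, so the bound does reward participation, but whether the interval is meaningful at these sample sizes is the open question.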
What am I missing here that would be a perfect fit?
probability analysis scoring-algorithm
asked 15 hours ago
tastychocolatemilk