count distinct of text field in data studio increased when using filter
I want to count the number of sessions with a certain event label in google data studio. I have created a new field in data studio on a google analytics source like this:
COUNT_DISTINCT(CASE WHEN Event Label = "Form Start" THEN Session ID ELSE "" END)
where Session ID is a custom dimension from GA (string).
The problem is that when I for example pull this new metric to a scorecard, I get a value of 6, if I then add a filter on this scorecard with Event Label = "Form Start" (the exact same event label as in the case statement of the new field) the metric is increased to 23! (which is the correct number).
Is there some data truncation going on in data studio behind the scenes or why does using the filter increase the distinct count?
google-analytics google-data-studio
add a comment |
I want to count the number of sessions with a certain event label in google data studio. I have created a new field in data studio on a google analytics source like this:
COUNT_DISTINCT(CASE WHEN Event Label = "Form Start" THEN Session ID ELSE "" END)
where Session ID is a custom dimension from GA (string).
The problem is that when I for example pull this new metric to a scorecard, I get a value of 6, if I then add a filter on this scorecard with Event Label = "Form Start" (the exact same event label as in the case statement of the new field) the metric is increased to 23! (which is the correct number).
Is there some data truncation going on in data studio behind the scenes or why does using the filter increase the distinct count?
google-analytics google-data-studio
add a comment |
I want to count the number of sessions with a certain event label in google data studio. I have created a new field in data studio on a google analytics source like this:
COUNT_DISTINCT(CASE WHEN Event Label = "Form Start" THEN Session ID ELSE "" END)
where Session ID is a custom dimension from GA (string).
The problem is that when I for example pull this new metric to a scorecard, I get a value of 6, if I then add a filter on this scorecard with Event Label = "Form Start" (the exact same event label as in the case statement of the new field) the metric is increased to 23! (which is the correct number).
Is there some data truncation going on in data studio behind the scenes or why does using the filter increase the distinct count?
google-analytics google-data-studio
I want to count the number of sessions with a certain event label in google data studio. I have created a new field in data studio on a google analytics source like this:
COUNT_DISTINCT(CASE WHEN Event Label = "Form Start" THEN Session ID ELSE "" END)
where Session ID is a custom dimension from GA (string).
The problem is that when I for example pull this new metric to a scorecard, I get a value of 6, if I then add a filter on this scorecard with Event Label = "Form Start" (the exact same event label as in the case statement of the new field) the metric is increased to 23! (which is the correct number).
Is there some data truncation going on in data studio behind the scenes or why does using the filter increase the distinct count?
google-analytics google-data-studio
google-analytics google-data-studio
asked Jan 2 at 10:51
IchtaIchta
13517
13517
add a comment |
add a comment |
2 Answers
2
active
oldest
votes
The weird numbers you're seeing could be due to sampling. At the bottom of the report in "view" mode, it should indicate if the numbers are sampled or not.
Also, the Unique Events metric should tell you the number of times a specific event happened per session. You might not need to do all that custom work in data studio, just a filter for the label.
Ok, I can't see any information about sampling in the report but it would explain the problem. Is there a way to remove sampling? I'm trying to create custom conversion rates, that's why I need the calculated metric (following this: datarunsdeep.com.au/blog/…)
– Ichta
Jan 2 at 16:36
You can use a shorter date range. Otherwise, you’ll need to pay for GA360
– XTOTHEL
Jan 2 at 16:37
we have GA360 already
– Ichta
Jan 2 at 16:38
in that case you shouldn't have any sampling issues, unless you're pulling hundreds of million worth of sessions.
– XTOTHEL
Jan 2 at 16:47
The sesssions are around 500 000 and there is no problem counting the unique number of session ids. The problem seems to arise when mixing in the event labels (which are on hit level) which increases the detail level the calculation has to go through. I just find it strange that there is no problem filtering on event label but only to use it in a case statement.
– Ichta
Jan 3 at 8:40
|
show 2 more comments
I might be missing something that requires the COUNT_DISTINCT function, but would a simpler, different formula work?
CASE WHEN Event Label = "Form Start" THEN 1 ELSE 0 END
This would create a number field that that can be used in the metric element of a score card with multiple aggregation options? The key options being SUM :)
Yes, I wish that would work, tried it already but the sum of that statement actually always returns 1 :/
– Ichta
Jan 10 at 11:46
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f54005017%2fcount-distinct-of-text-field-in-data-studio-increased-when-using-filter%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
2 Answers
2
active
oldest
votes
2 Answers
2
active
oldest
votes
active
oldest
votes
active
oldest
votes
The weird numbers you're seeing could be due to sampling. At the bottom of the report in "view" mode, it should indicate if the numbers are sampled or not.
Also, the Unique Events metric should tell you the number of times a specific event happened per session. You might not need to do all that custom work in data studio, just a filter for the label.
Ok, I can't see any information about sampling in the report but it would explain the problem. Is there a way to remove sampling? I'm trying to create custom conversion rates, that's why I need the calculated metric (following this: datarunsdeep.com.au/blog/…)
– Ichta
Jan 2 at 16:36
You can use a shorter date range. Otherwise, you’ll need to pay for GA360
– XTOTHEL
Jan 2 at 16:37
we have GA360 already
– Ichta
Jan 2 at 16:38
in that case you shouldn't have any sampling issues, unless you're pulling hundreds of million worth of sessions.
– XTOTHEL
Jan 2 at 16:47
The sesssions are around 500 000 and there is no problem counting the unique number of session ids. The problem seems to arise when mixing in the event labels (which are on hit level) which increases the detail level the calculation has to go through. I just find it strange that there is no problem filtering on event label but only to use it in a case statement.
– Ichta
Jan 3 at 8:40
|
show 2 more comments
The weird numbers you're seeing could be due to sampling. At the bottom of the report in "view" mode, it should indicate if the numbers are sampled or not.
Also, the Unique Events metric should tell you the number of times a specific event happened per session. You might not need to do all that custom work in data studio, just a filter for the label.
Ok, I can't see any information about sampling in the report but it would explain the problem. Is there a way to remove sampling? I'm trying to create custom conversion rates, that's why I need the calculated metric (following this: datarunsdeep.com.au/blog/…)
– Ichta
Jan 2 at 16:36
You can use a shorter date range. Otherwise, you’ll need to pay for GA360
– XTOTHEL
Jan 2 at 16:37
we have GA360 already
– Ichta
Jan 2 at 16:38
in that case you shouldn't have any sampling issues, unless you're pulling hundreds of million worth of sessions.
– XTOTHEL
Jan 2 at 16:47
The sesssions are around 500 000 and there is no problem counting the unique number of session ids. The problem seems to arise when mixing in the event labels (which are on hit level) which increases the detail level the calculation has to go through. I just find it strange that there is no problem filtering on event label but only to use it in a case statement.
– Ichta
Jan 3 at 8:40
|
show 2 more comments
The weird numbers you're seeing could be due to sampling. At the bottom of the report in "view" mode, it should indicate if the numbers are sampled or not.
Also, the Unique Events metric should tell you the number of times a specific event happened per session. You might not need to do all that custom work in data studio, just a filter for the label.
The weird numbers you're seeing could be due to sampling. At the bottom of the report in "view" mode, it should indicate if the numbers are sampled or not.
Also, the Unique Events metric should tell you the number of times a specific event happened per session. You might not need to do all that custom work in data studio, just a filter for the label.
answered Jan 2 at 16:24
XTOTHELXTOTHEL
2,4511311
2,4511311
Ok, I can't see any information about sampling in the report but it would explain the problem. Is there a way to remove sampling? I'm trying to create custom conversion rates, that's why I need the calculated metric (following this: datarunsdeep.com.au/blog/…)
– Ichta
Jan 2 at 16:36
You can use a shorter date range. Otherwise, you’ll need to pay for GA360
– XTOTHEL
Jan 2 at 16:37
we have GA360 already
– Ichta
Jan 2 at 16:38
in that case you shouldn't have any sampling issues, unless you're pulling hundreds of million worth of sessions.
– XTOTHEL
Jan 2 at 16:47
The sesssions are around 500 000 and there is no problem counting the unique number of session ids. The problem seems to arise when mixing in the event labels (which are on hit level) which increases the detail level the calculation has to go through. I just find it strange that there is no problem filtering on event label but only to use it in a case statement.
– Ichta
Jan 3 at 8:40
|
show 2 more comments
Ok, I can't see any information about sampling in the report but it would explain the problem. Is there a way to remove sampling? I'm trying to create custom conversion rates, that's why I need the calculated metric (following this: datarunsdeep.com.au/blog/…)
– Ichta
Jan 2 at 16:36
You can use a shorter date range. Otherwise, you’ll need to pay for GA360
– XTOTHEL
Jan 2 at 16:37
we have GA360 already
– Ichta
Jan 2 at 16:38
in that case you shouldn't have any sampling issues, unless you're pulling hundreds of million worth of sessions.
– XTOTHEL
Jan 2 at 16:47
The sesssions are around 500 000 and there is no problem counting the unique number of session ids. The problem seems to arise when mixing in the event labels (which are on hit level) which increases the detail level the calculation has to go through. I just find it strange that there is no problem filtering on event label but only to use it in a case statement.
– Ichta
Jan 3 at 8:40
Ok, I can't see any information about sampling in the report but it would explain the problem. Is there a way to remove sampling? I'm trying to create custom conversion rates, that's why I need the calculated metric (following this: datarunsdeep.com.au/blog/…)
– Ichta
Jan 2 at 16:36
Ok, I can't see any information about sampling in the report but it would explain the problem. Is there a way to remove sampling? I'm trying to create custom conversion rates, that's why I need the calculated metric (following this: datarunsdeep.com.au/blog/…)
– Ichta
Jan 2 at 16:36
You can use a shorter date range. Otherwise, you’ll need to pay for GA360
– XTOTHEL
Jan 2 at 16:37
You can use a shorter date range. Otherwise, you’ll need to pay for GA360
– XTOTHEL
Jan 2 at 16:37
we have GA360 already
– Ichta
Jan 2 at 16:38
we have GA360 already
– Ichta
Jan 2 at 16:38
in that case you shouldn't have any sampling issues, unless you're pulling hundreds of million worth of sessions.
– XTOTHEL
Jan 2 at 16:47
in that case you shouldn't have any sampling issues, unless you're pulling hundreds of million worth of sessions.
– XTOTHEL
Jan 2 at 16:47
The sesssions are around 500 000 and there is no problem counting the unique number of session ids. The problem seems to arise when mixing in the event labels (which are on hit level) which increases the detail level the calculation has to go through. I just find it strange that there is no problem filtering on event label but only to use it in a case statement.
– Ichta
Jan 3 at 8:40
The sesssions are around 500 000 and there is no problem counting the unique number of session ids. The problem seems to arise when mixing in the event labels (which are on hit level) which increases the detail level the calculation has to go through. I just find it strange that there is no problem filtering on event label but only to use it in a case statement.
– Ichta
Jan 3 at 8:40
|
show 2 more comments
I might be missing something that requires the COUNT_DISTINCT function, but would a simpler, different formula work?
CASE WHEN Event Label = "Form Start" THEN 1 ELSE 0 END
This would create a number field that that can be used in the metric element of a score card with multiple aggregation options? The key options being SUM :)
Yes, I wish that would work, tried it already but the sum of that statement actually always returns 1 :/
– Ichta
Jan 10 at 11:46
add a comment |
I might be missing something that requires the COUNT_DISTINCT function, but would a simpler, different formula work?
CASE WHEN Event Label = "Form Start" THEN 1 ELSE 0 END
This would create a number field that that can be used in the metric element of a score card with multiple aggregation options? The key options being SUM :)
Yes, I wish that would work, tried it already but the sum of that statement actually always returns 1 :/
– Ichta
Jan 10 at 11:46
add a comment |
I might be missing something that requires the COUNT_DISTINCT function, but would a simpler, different formula work?
CASE WHEN Event Label = "Form Start" THEN 1 ELSE 0 END
This would create a number field that that can be used in the metric element of a score card with multiple aggregation options? The key options being SUM :)
I might be missing something that requires the COUNT_DISTINCT function, but would a simpler, different formula work?
CASE WHEN Event Label = "Form Start" THEN 1 ELSE 0 END
This would create a number field that that can be used in the metric element of a score card with multiple aggregation options? The key options being SUM :)
answered Jan 8 at 22:39


Pete MontgomeryPete Montgomery
1
1
Yes, I wish that would work, tried it already but the sum of that statement actually always returns 1 :/
– Ichta
Jan 10 at 11:46
add a comment |
Yes, I wish that would work, tried it already but the sum of that statement actually always returns 1 :/
– Ichta
Jan 10 at 11:46
Yes, I wish that would work, tried it already but the sum of that statement actually always returns 1 :/
– Ichta
Jan 10 at 11:46
Yes, I wish that would work, tried it already but the sum of that statement actually always returns 1 :/
– Ichta
Jan 10 at 11:46
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f54005017%2fcount-distinct-of-text-field-in-data-studio-increased-when-using-filter%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown