count distinct of text field in data studio increased when using filter












0















I want to count the number of sessions with a certain event label in google data studio. I have created a new field in data studio on a google analytics source like this:



COUNT_DISTINCT(CASE WHEN Event Label = "Form Start" THEN Session ID ELSE "" END)


where Session ID is a custom dimension from GA (string).
The problem is that when I for example pull this new metric to a scorecard, I get a value of 6, if I then add a filter on this scorecard with Event Label = "Form Start" (the exact same event label as in the case statement of the new field) the metric is increased to 23! (which is the correct number).



Is there some data truncation going on in data studio behind the scenes or why does using the filter increase the distinct count?










share|improve this question



























    0















    I want to count the number of sessions with a certain event label in google data studio. I have created a new field in data studio on a google analytics source like this:



    COUNT_DISTINCT(CASE WHEN Event Label = "Form Start" THEN Session ID ELSE "" END)


    where Session ID is a custom dimension from GA (string).
    The problem is that when I for example pull this new metric to a scorecard, I get a value of 6, if I then add a filter on this scorecard with Event Label = "Form Start" (the exact same event label as in the case statement of the new field) the metric is increased to 23! (which is the correct number).



    Is there some data truncation going on in data studio behind the scenes or why does using the filter increase the distinct count?










    share|improve this question

























      0












      0








      0








      I want to count the number of sessions with a certain event label in google data studio. I have created a new field in data studio on a google analytics source like this:



      COUNT_DISTINCT(CASE WHEN Event Label = "Form Start" THEN Session ID ELSE "" END)


      where Session ID is a custom dimension from GA (string).
      The problem is that when I for example pull this new metric to a scorecard, I get a value of 6, if I then add a filter on this scorecard with Event Label = "Form Start" (the exact same event label as in the case statement of the new field) the metric is increased to 23! (which is the correct number).



      Is there some data truncation going on in data studio behind the scenes or why does using the filter increase the distinct count?










      share|improve this question














      I want to count the number of sessions with a certain event label in google data studio. I have created a new field in data studio on a google analytics source like this:



      COUNT_DISTINCT(CASE WHEN Event Label = "Form Start" THEN Session ID ELSE "" END)


      where Session ID is a custom dimension from GA (string).
      The problem is that when I for example pull this new metric to a scorecard, I get a value of 6, if I then add a filter on this scorecard with Event Label = "Form Start" (the exact same event label as in the case statement of the new field) the metric is increased to 23! (which is the correct number).



      Is there some data truncation going on in data studio behind the scenes or why does using the filter increase the distinct count?







      google-analytics google-data-studio






      share|improve this question













      share|improve this question











      share|improve this question




      share|improve this question










      asked Jan 2 at 10:51









      IchtaIchta

      13517




      13517
























          2 Answers
          2






          active

          oldest

          votes


















          0














          The weird numbers you're seeing could be due to sampling. At the bottom of the report in "view" mode, it should indicate if the numbers are sampled or not.



          Also, the Unique Events metric should tell you the number of times a specific event happened per session. You might not need to do all that custom work in data studio, just a filter for the label.






          share|improve this answer
























          • Ok, I can't see any information about sampling in the report but it would explain the problem. Is there a way to remove sampling? I'm trying to create custom conversion rates, that's why I need the calculated metric (following this: datarunsdeep.com.au/blog/…)

            – Ichta
            Jan 2 at 16:36













          • You can use a shorter date range. Otherwise, you’ll need to pay for GA360

            – XTOTHEL
            Jan 2 at 16:37











          • we have GA360 already

            – Ichta
            Jan 2 at 16:38











          • in that case you shouldn't have any sampling issues, unless you're pulling hundreds of million worth of sessions.

            – XTOTHEL
            Jan 2 at 16:47











          • The sesssions are around 500 000 and there is no problem counting the unique number of session ids. The problem seems to arise when mixing in the event labels (which are on hit level) which increases the detail level the calculation has to go through. I just find it strange that there is no problem filtering on event label but only to use it in a case statement.

            – Ichta
            Jan 3 at 8:40



















          0














          I might be missing something that requires the COUNT_DISTINCT function, but would a simpler, different formula work?



          CASE WHEN Event Label = "Form Start" THEN 1 ELSE 0 END

          This would create a number field that that can be used in the metric element of a score card with multiple aggregation options? The key options being SUM :)




          share|improve this answer
























          • Yes, I wish that would work, tried it already but the sum of that statement actually always returns 1 :/

            – Ichta
            Jan 10 at 11:46











          Your Answer






          StackExchange.ifUsing("editor", function () {
          StackExchange.using("externalEditor", function () {
          StackExchange.using("snippets", function () {
          StackExchange.snippets.init();
          });
          });
          }, "code-snippets");

          StackExchange.ready(function() {
          var channelOptions = {
          tags: "".split(" "),
          id: "1"
          };
          initTagRenderer("".split(" "), "".split(" "), channelOptions);

          StackExchange.using("externalEditor", function() {
          // Have to fire editor after snippets, if snippets enabled
          if (StackExchange.settings.snippets.snippetsEnabled) {
          StackExchange.using("snippets", function() {
          createEditor();
          });
          }
          else {
          createEditor();
          }
          });

          function createEditor() {
          StackExchange.prepareEditor({
          heartbeatType: 'answer',
          autoActivateHeartbeat: false,
          convertImagesToLinks: true,
          noModals: true,
          showLowRepImageUploadWarning: true,
          reputationToPostImages: 10,
          bindNavPrevention: true,
          postfix: "",
          imageUploader: {
          brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
          contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
          allowUrls: true
          },
          onDemand: true,
          discardSelector: ".discard-answer"
          ,immediatelyShowMarkdownHelp:true
          });


          }
          });














          draft saved

          draft discarded


















          StackExchange.ready(
          function () {
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f54005017%2fcount-distinct-of-text-field-in-data-studio-increased-when-using-filter%23new-answer', 'question_page');
          }
          );

          Post as a guest















          Required, but never shown

























          2 Answers
          2






          active

          oldest

          votes








          2 Answers
          2






          active

          oldest

          votes









          active

          oldest

          votes






          active

          oldest

          votes









          0














          The weird numbers you're seeing could be due to sampling. At the bottom of the report in "view" mode, it should indicate if the numbers are sampled or not.



          Also, the Unique Events metric should tell you the number of times a specific event happened per session. You might not need to do all that custom work in data studio, just a filter for the label.






          share|improve this answer
























          • Ok, I can't see any information about sampling in the report but it would explain the problem. Is there a way to remove sampling? I'm trying to create custom conversion rates, that's why I need the calculated metric (following this: datarunsdeep.com.au/blog/…)

            – Ichta
            Jan 2 at 16:36













          • You can use a shorter date range. Otherwise, you’ll need to pay for GA360

            – XTOTHEL
            Jan 2 at 16:37











          • we have GA360 already

            – Ichta
            Jan 2 at 16:38











          • in that case you shouldn't have any sampling issues, unless you're pulling hundreds of million worth of sessions.

            – XTOTHEL
            Jan 2 at 16:47











          • The sesssions are around 500 000 and there is no problem counting the unique number of session ids. The problem seems to arise when mixing in the event labels (which are on hit level) which increases the detail level the calculation has to go through. I just find it strange that there is no problem filtering on event label but only to use it in a case statement.

            – Ichta
            Jan 3 at 8:40
















          0














          The weird numbers you're seeing could be due to sampling. At the bottom of the report in "view" mode, it should indicate if the numbers are sampled or not.



          Also, the Unique Events metric should tell you the number of times a specific event happened per session. You might not need to do all that custom work in data studio, just a filter for the label.






          share|improve this answer
























          • Ok, I can't see any information about sampling in the report but it would explain the problem. Is there a way to remove sampling? I'm trying to create custom conversion rates, that's why I need the calculated metric (following this: datarunsdeep.com.au/blog/…)

            – Ichta
            Jan 2 at 16:36













          • You can use a shorter date range. Otherwise, you’ll need to pay for GA360

            – XTOTHEL
            Jan 2 at 16:37











          • we have GA360 already

            – Ichta
            Jan 2 at 16:38











          • in that case you shouldn't have any sampling issues, unless you're pulling hundreds of million worth of sessions.

            – XTOTHEL
            Jan 2 at 16:47











          • The sesssions are around 500 000 and there is no problem counting the unique number of session ids. The problem seems to arise when mixing in the event labels (which are on hit level) which increases the detail level the calculation has to go through. I just find it strange that there is no problem filtering on event label but only to use it in a case statement.

            – Ichta
            Jan 3 at 8:40














          0












          0








          0







          The weird numbers you're seeing could be due to sampling. At the bottom of the report in "view" mode, it should indicate if the numbers are sampled or not.



          Also, the Unique Events metric should tell you the number of times a specific event happened per session. You might not need to do all that custom work in data studio, just a filter for the label.






          share|improve this answer













          The weird numbers you're seeing could be due to sampling. At the bottom of the report in "view" mode, it should indicate if the numbers are sampled or not.



          Also, the Unique Events metric should tell you the number of times a specific event happened per session. You might not need to do all that custom work in data studio, just a filter for the label.







          share|improve this answer












          share|improve this answer



          share|improve this answer










          answered Jan 2 at 16:24









          XTOTHELXTOTHEL

          2,4511311




          2,4511311













          • Ok, I can't see any information about sampling in the report but it would explain the problem. Is there a way to remove sampling? I'm trying to create custom conversion rates, that's why I need the calculated metric (following this: datarunsdeep.com.au/blog/…)

            – Ichta
            Jan 2 at 16:36













          • You can use a shorter date range. Otherwise, you’ll need to pay for GA360

            – XTOTHEL
            Jan 2 at 16:37











          • we have GA360 already

            – Ichta
            Jan 2 at 16:38











          • in that case you shouldn't have any sampling issues, unless you're pulling hundreds of million worth of sessions.

            – XTOTHEL
            Jan 2 at 16:47











          • The sesssions are around 500 000 and there is no problem counting the unique number of session ids. The problem seems to arise when mixing in the event labels (which are on hit level) which increases the detail level the calculation has to go through. I just find it strange that there is no problem filtering on event label but only to use it in a case statement.

            – Ichta
            Jan 3 at 8:40



















          • Ok, I can't see any information about sampling in the report but it would explain the problem. Is there a way to remove sampling? I'm trying to create custom conversion rates, that's why I need the calculated metric (following this: datarunsdeep.com.au/blog/…)

            – Ichta
            Jan 2 at 16:36













          • You can use a shorter date range. Otherwise, you’ll need to pay for GA360

            – XTOTHEL
            Jan 2 at 16:37











          • we have GA360 already

            – Ichta
            Jan 2 at 16:38











          • in that case you shouldn't have any sampling issues, unless you're pulling hundreds of million worth of sessions.

            – XTOTHEL
            Jan 2 at 16:47











          • The sesssions are around 500 000 and there is no problem counting the unique number of session ids. The problem seems to arise when mixing in the event labels (which are on hit level) which increases the detail level the calculation has to go through. I just find it strange that there is no problem filtering on event label but only to use it in a case statement.

            – Ichta
            Jan 3 at 8:40

















          Ok, I can't see any information about sampling in the report but it would explain the problem. Is there a way to remove sampling? I'm trying to create custom conversion rates, that's why I need the calculated metric (following this: datarunsdeep.com.au/blog/…)

          – Ichta
          Jan 2 at 16:36







          Ok, I can't see any information about sampling in the report but it would explain the problem. Is there a way to remove sampling? I'm trying to create custom conversion rates, that's why I need the calculated metric (following this: datarunsdeep.com.au/blog/…)

          – Ichta
          Jan 2 at 16:36















          You can use a shorter date range. Otherwise, you’ll need to pay for GA360

          – XTOTHEL
          Jan 2 at 16:37





          You can use a shorter date range. Otherwise, you’ll need to pay for GA360

          – XTOTHEL
          Jan 2 at 16:37













          we have GA360 already

          – Ichta
          Jan 2 at 16:38





          we have GA360 already

          – Ichta
          Jan 2 at 16:38













          in that case you shouldn't have any sampling issues, unless you're pulling hundreds of million worth of sessions.

          – XTOTHEL
          Jan 2 at 16:47





          in that case you shouldn't have any sampling issues, unless you're pulling hundreds of million worth of sessions.

          – XTOTHEL
          Jan 2 at 16:47













          The sesssions are around 500 000 and there is no problem counting the unique number of session ids. The problem seems to arise when mixing in the event labels (which are on hit level) which increases the detail level the calculation has to go through. I just find it strange that there is no problem filtering on event label but only to use it in a case statement.

          – Ichta
          Jan 3 at 8:40





          The sesssions are around 500 000 and there is no problem counting the unique number of session ids. The problem seems to arise when mixing in the event labels (which are on hit level) which increases the detail level the calculation has to go through. I just find it strange that there is no problem filtering on event label but only to use it in a case statement.

          – Ichta
          Jan 3 at 8:40













          0














          I might be missing something that requires the COUNT_DISTINCT function, but would a simpler, different formula work?



          CASE WHEN Event Label = "Form Start" THEN 1 ELSE 0 END

          This would create a number field that that can be used in the metric element of a score card with multiple aggregation options? The key options being SUM :)




          share|improve this answer
























          • Yes, I wish that would work, tried it already but the sum of that statement actually always returns 1 :/

            – Ichta
            Jan 10 at 11:46
















          0














          I might be missing something that requires the COUNT_DISTINCT function, but would a simpler, different formula work?



          CASE WHEN Event Label = "Form Start" THEN 1 ELSE 0 END

          This would create a number field that that can be used in the metric element of a score card with multiple aggregation options? The key options being SUM :)




          share|improve this answer
























          • Yes, I wish that would work, tried it already but the sum of that statement actually always returns 1 :/

            – Ichta
            Jan 10 at 11:46














          0












          0








          0







          I might be missing something that requires the COUNT_DISTINCT function, but would a simpler, different formula work?



          CASE WHEN Event Label = "Form Start" THEN 1 ELSE 0 END

          This would create a number field that that can be used in the metric element of a score card with multiple aggregation options? The key options being SUM :)




          share|improve this answer













          I might be missing something that requires the COUNT_DISTINCT function, but would a simpler, different formula work?



          CASE WHEN Event Label = "Form Start" THEN 1 ELSE 0 END

          This would create a number field that that can be used in the metric element of a score card with multiple aggregation options? The key options being SUM :)





          share|improve this answer












          share|improve this answer



          share|improve this answer










          answered Jan 8 at 22:39









          Pete MontgomeryPete Montgomery

          1




          1













          • Yes, I wish that would work, tried it already but the sum of that statement actually always returns 1 :/

            – Ichta
            Jan 10 at 11:46



















          • Yes, I wish that would work, tried it already but the sum of that statement actually always returns 1 :/

            – Ichta
            Jan 10 at 11:46

















          Yes, I wish that would work, tried it already but the sum of that statement actually always returns 1 :/

          – Ichta
          Jan 10 at 11:46





          Yes, I wish that would work, tried it already but the sum of that statement actually always returns 1 :/

          – Ichta
          Jan 10 at 11:46


















          draft saved

          draft discarded




















































          Thanks for contributing an answer to Stack Overflow!


          • Please be sure to answer the question. Provide details and share your research!

          But avoid



          • Asking for help, clarification, or responding to other answers.

          • Making statements based on opinion; back them up with references or personal experience.


          To learn more, see our tips on writing great answers.




          draft saved


          draft discarded














          StackExchange.ready(
          function () {
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f54005017%2fcount-distinct-of-text-field-in-data-studio-increased-when-using-filter%23new-answer', 'question_page');
          }
          );

          Post as a guest















          Required, but never shown





















































          Required, but never shown














          Required, but never shown












          Required, but never shown







          Required, but never shown

































          Required, but never shown














          Required, but never shown












          Required, but never shown







          Required, but never shown







          Popular posts from this blog

          MongoDB - Not Authorized To Execute Command

          How to fix TextFormField cause rebuild widget in Flutter

          Npm cannot find a required file even through it is in the searched directory