Is there a way to group timestamp data by 30 day intervals starting from the min(date) and add them as...





.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty,.everyoneloves__bot-mid-leaderboard:empty{ height:90px;width:728px;box-sizing:border-box;
}







2















I am trying to use the min() value of a timestamp as a starting point and then group data by 30 day intervals in order to get a count of occurrences for each unique value within the timestamp date range as columns



i have two tables that i am joining together to get a count. Table 1 (page_creation) has 2 columns labeled link and dt_crtd. Table 2(page visits) has 2 other columns labeled url and date. the tables are being joined by joining table1.link = table2.pagevisits.



After the join i get a table similar to this:



+-------------------+------------------------+
| url | date |
+-------------------+------------------------+
| www.google.com | 2018-01-01 00:00:00' |
| www.google.com | 2018-01-02 00:00:00' |
| www.google.com | 2018-02-01 00:00:00' |
| www.google.com | 2018-02-05 00:00:00' |
| www.google.com | 2018-03-04 00:00:00' |
| www.facebook.com | 2014-01-05 00:00:00' |
| www.facebook.com | 2014-01-07 00:00:00' |
| www.facebook.com | 2014-04-02 00:00:00' |
| www.facebook.com | 2014-04-10 00:00:00' |
| www.facebook.com | 2014-04-11 00:00:00' |
| www.facebook.com | 2014-05-01 00:00:00' |
| www.twitter.com | 2016-02-01 00:00:00' |
| www.twitter.com | 2016-03-04 00:00:00' |
+---------------------+----------------------+


what i am trying to get is results that pull this :



+-------------------+------------------------+------------+------------+-------------+
| url | MIN_Date | Interval 1 | Interval 2| Interval 3 |
+-------------------+------------------------+-------------+-----------+-------------+
| www.google.com | 2018-01-01 00:00:00' | 2 | 2 | 1
| www.facebook.com | 2014-01-05 00:00:00' | 2 | 0 | 1
| www.twitter.com | 2016-02-01 00:00:00' | 1 | 1 | 0
+---------------------+----------------------+-------------+-----------+-------------+


So the 30 day intervals begin from the min(date) as shown in Interval 1 and are counted every 30 days.



Ive looked at other questions such as :



Group rows by 7 days interval starting from a certain date



MySQL query to select min datetime grouped by 30 day intervals



However it did not seem to answer my specific problem.



Ive also looked into pivot syntax but noticed it is only supported for certain DBMS.



Any help would be greatly appreciated.



Thank you.










share|improve this question























  • are you looking to get a variable number of columns, depending on how many 30 day intervals are found?

    – Felipe Hoffa
    Jan 3 at 7:03


















2















I am trying to use the min() value of a timestamp as a starting point and then group data by 30 day intervals in order to get a count of occurrences for each unique value within the timestamp date range as columns



i have two tables that i am joining together to get a count. Table 1 (page_creation) has 2 columns labeled link and dt_crtd. Table 2(page visits) has 2 other columns labeled url and date. the tables are being joined by joining table1.link = table2.pagevisits.



After the join i get a table similar to this:



+-------------------+------------------------+
| url | date |
+-------------------+------------------------+
| www.google.com | 2018-01-01 00:00:00' |
| www.google.com | 2018-01-02 00:00:00' |
| www.google.com | 2018-02-01 00:00:00' |
| www.google.com | 2018-02-05 00:00:00' |
| www.google.com | 2018-03-04 00:00:00' |
| www.facebook.com | 2014-01-05 00:00:00' |
| www.facebook.com | 2014-01-07 00:00:00' |
| www.facebook.com | 2014-04-02 00:00:00' |
| www.facebook.com | 2014-04-10 00:00:00' |
| www.facebook.com | 2014-04-11 00:00:00' |
| www.facebook.com | 2014-05-01 00:00:00' |
| www.twitter.com | 2016-02-01 00:00:00' |
| www.twitter.com | 2016-03-04 00:00:00' |
+---------------------+----------------------+


what i am trying to get is results that pull this :



+-------------------+------------------------+------------+------------+-------------+
| url | MIN_Date | Interval 1 | Interval 2| Interval 3 |
+-------------------+------------------------+-------------+-----------+-------------+
| www.google.com | 2018-01-01 00:00:00' | 2 | 2 | 1
| www.facebook.com | 2014-01-05 00:00:00' | 2 | 0 | 1
| www.twitter.com | 2016-02-01 00:00:00' | 1 | 1 | 0
+---------------------+----------------------+-------------+-----------+-------------+


So the 30 day intervals begin from the min(date) as shown in Interval 1 and are counted every 30 days.



Ive looked at other questions such as :



Group rows by 7 days interval starting from a certain date



MySQL query to select min datetime grouped by 30 day intervals



However it did not seem to answer my specific problem.



Ive also looked into pivot syntax but noticed it is only supported for certain DBMS.



Any help would be greatly appreciated.



Thank you.










share|improve this question























  • are you looking to get a variable number of columns, depending on how many 30 day intervals are found?

    – Felipe Hoffa
    Jan 3 at 7:03














2












2








2


1






I am trying to use the min() value of a timestamp as a starting point and then group data by 30 day intervals in order to get a count of occurrences for each unique value within the timestamp date range as columns



i have two tables that i am joining together to get a count. Table 1 (page_creation) has 2 columns labeled link and dt_crtd. Table 2(page visits) has 2 other columns labeled url and date. the tables are being joined by joining table1.link = table2.pagevisits.



After the join i get a table similar to this:



+-------------------+------------------------+
| url | date |
+-------------------+------------------------+
| www.google.com | 2018-01-01 00:00:00' |
| www.google.com | 2018-01-02 00:00:00' |
| www.google.com | 2018-02-01 00:00:00' |
| www.google.com | 2018-02-05 00:00:00' |
| www.google.com | 2018-03-04 00:00:00' |
| www.facebook.com | 2014-01-05 00:00:00' |
| www.facebook.com | 2014-01-07 00:00:00' |
| www.facebook.com | 2014-04-02 00:00:00' |
| www.facebook.com | 2014-04-10 00:00:00' |
| www.facebook.com | 2014-04-11 00:00:00' |
| www.facebook.com | 2014-05-01 00:00:00' |
| www.twitter.com | 2016-02-01 00:00:00' |
| www.twitter.com | 2016-03-04 00:00:00' |
+---------------------+----------------------+


what i am trying to get is results that pull this :



+-------------------+------------------------+------------+------------+-------------+
| url | MIN_Date | Interval 1 | Interval 2| Interval 3 |
+-------------------+------------------------+-------------+-----------+-------------+
| www.google.com | 2018-01-01 00:00:00' | 2 | 2 | 1
| www.facebook.com | 2014-01-05 00:00:00' | 2 | 0 | 1
| www.twitter.com | 2016-02-01 00:00:00' | 1 | 1 | 0
+---------------------+----------------------+-------------+-----------+-------------+


So the 30 day intervals begin from the min(date) as shown in Interval 1 and are counted every 30 days.



Ive looked at other questions such as :



Group rows by 7 days interval starting from a certain date



MySQL query to select min datetime grouped by 30 day intervals



However it did not seem to answer my specific problem.



Ive also looked into pivot syntax but noticed it is only supported for certain DBMS.



Any help would be greatly appreciated.



Thank you.










share|improve this question














I am trying to use the min() value of a timestamp as a starting point and then group data by 30 day intervals in order to get a count of occurrences for each unique value within the timestamp date range as columns



i have two tables that i am joining together to get a count. Table 1 (page_creation) has 2 columns labeled link and dt_crtd. Table 2(page visits) has 2 other columns labeled url and date. the tables are being joined by joining table1.link = table2.pagevisits.



After the join i get a table similar to this:



+-------------------+------------------------+
| url | date |
+-------------------+------------------------+
| www.google.com | 2018-01-01 00:00:00' |
| www.google.com | 2018-01-02 00:00:00' |
| www.google.com | 2018-02-01 00:00:00' |
| www.google.com | 2018-02-05 00:00:00' |
| www.google.com | 2018-03-04 00:00:00' |
| www.facebook.com | 2014-01-05 00:00:00' |
| www.facebook.com | 2014-01-07 00:00:00' |
| www.facebook.com | 2014-04-02 00:00:00' |
| www.facebook.com | 2014-04-10 00:00:00' |
| www.facebook.com | 2014-04-11 00:00:00' |
| www.facebook.com | 2014-05-01 00:00:00' |
| www.twitter.com | 2016-02-01 00:00:00' |
| www.twitter.com | 2016-03-04 00:00:00' |
+---------------------+----------------------+


what i am trying to get is results that pull this :



+-------------------+------------------------+------------+------------+-------------+
| url | MIN_Date | Interval 1 | Interval 2| Interval 3 |
+-------------------+------------------------+-------------+-----------+-------------+
| www.google.com | 2018-01-01 00:00:00' | 2 | 2 | 1
| www.facebook.com | 2014-01-05 00:00:00' | 2 | 0 | 1
| www.twitter.com | 2016-02-01 00:00:00' | 1 | 1 | 0
+---------------------+----------------------+-------------+-----------+-------------+


So the 30 day intervals begin from the min(date) as shown in Interval 1 and are counted every 30 days.



Ive looked at other questions such as :



Group rows by 7 days interval starting from a certain date



MySQL query to select min datetime grouped by 30 day intervals



However it did not seem to answer my specific problem.



Ive also looked into pivot syntax but noticed it is only supported for certain DBMS.



Any help would be greatly appreciated.



Thank you.







sql google-bigquery google-query-language






share|improve this question













share|improve this question











share|improve this question




share|improve this question










asked Jan 3 at 5:13









CCPCCP

439




439













  • are you looking to get a variable number of columns, depending on how many 30 day intervals are found?

    – Felipe Hoffa
    Jan 3 at 7:03



















  • are you looking to get a variable number of columns, depending on how many 30 day intervals are found?

    – Felipe Hoffa
    Jan 3 at 7:03

















are you looking to get a variable number of columns, depending on how many 30 day intervals are found?

– Felipe Hoffa
Jan 3 at 7:03





are you looking to get a variable number of columns, depending on how many 30 day intervals are found?

– Felipe Hoffa
Jan 3 at 7:03












3 Answers
3






active

oldest

votes


















1














If you are using BigQuery, I would recommend:





  • countif() to count a boolean value


  • timestamp_add() to add intervals to timestamps


The exact boundaries are a bit vague, but I would go for:



select pc.url,
countif(pv.date >= pc.dt_crtd and
pv.date < timestamp_add(pc.dt_crtd, interval 30 day
) as Interval_00_29,
countif(pv.date >= timestamp_add(pc.dt_crtd, interval 30 day) and
pv.date < timestamp_add(pc.dt_crtd, interval 60 day
) as Interval_30_59,
countif(pv.date >= timestamp_add(pc.dt_crtd, interval 60 day) and
pv.date < timestamp_add(pc.dt_crtd, interval 90 day
) as Interval_60_89
from page_creation pc join
page_visits pv
on pc.link = pv.url
group by pc.url





share|improve this answer































    2














    If I understood your question clearly, you want to calculate page visits between 30 , 60 , 90 days intervals after page creation. If it's the requirement, try below SQL code :-



    select a11.url
    ,Sum(case when a12.date between a11.dt_crtd and a11.dt_crtd+30 then 1 else 0) Interval_1
    ,Sum(case when a12.date between a11.dt_crtd+31 and a11.dt_crtd+60 then 1 else 0) Interval_2
    ,Sum(case when a12.date between a11.dt_crtd+61 and a11.dt_crtd+90 then 1 else 0) Interval_3
    from page_creation a11
    join page_visits a12
    on a11.link = a12.url
    group by a11.url





    share|improve this answer
























    • i will try that when i go into the office later. Does it matter that dt_crtd is a timestamp? In other words will adding +30 work without needing to convert the timestamp into Char()? Also to answer your question, yes you understood my question clearly! Thank you so much.

      – CCP
      Jan 3 at 12:28



















    1














    The way I am reading your scenario and especially based on example of After the join i get a table similar to ... is that you have two tables that you need to UNION - not to JOIN



    So, based on that reading below example is for BigQuery Standard SQL (project.dataset.page_creation and project.dataset.page_visits are here just to mimic your Table 1 and Table2)



    #standardSQL
    WITH `project.dataset.page_creation` AS (
    SELECT 'www.google.com' link, TIMESTAMP '2018-01-01 00:00:00' dt_crtd UNION ALL
    SELECT 'www.facebook.com', '2014-01-05 00:00:00' UNION ALL
    SELECT 'www.twitter.com', '2016-02-01 00:00:00'
    ), `project.dataset.page_visits` AS (
    SELECT 'www.google.com' url, TIMESTAMP '2018-01-02 00:00:00' dt UNION ALL
    SELECT 'www.google.com', '2018-02-01 00:00:00' UNION ALL
    SELECT 'www.google.com', '2018-02-05 00:00:00' UNION ALL
    SELECT 'www.google.com', '2018-03-04 00:00:00' UNION ALL
    SELECT 'www.facebook.com', '2014-01-07 00:00:00' UNION ALL
    SELECT 'www.facebook.com', '2014-04-02 00:00:00' UNION ALL
    SELECT 'www.facebook.com', '2014-04-10 00:00:00' UNION ALL
    SELECT 'www.facebook.com', '2014-04-11 00:00:00' UNION ALL
    SELECT 'www.facebook.com', '2014-05-01 00:00:00' UNION ALL
    SELECT 'www.twitter.com', '2016-03-04 00:00:00'
    ), `After the join` AS (
    SELECT url, dt FROM `project.dataset.page_visits` UNION DISTINCT
    SELECT link, dt_crtd FROM `project.dataset.page_creation`
    )
    SELECT
    url, min_date,
    COUNTIF(dt BETWEEN min_date AND TIMESTAMP_ADD(min_date, INTERVAL 29 DAY)) Interval_1,
    COUNTIF(dt BETWEEN TIMESTAMP_ADD(min_date, INTERVAL 30 DAY) AND TIMESTAMP_ADD(min_date, INTERVAL 59 DAY)) Interval_2,
    COUNTIF(dt BETWEEN TIMESTAMP_ADD(min_date, INTERVAL 60 DAY) AND TIMESTAMP_ADD(min_date, INTERVAL 89 DAY)) Interval_3
    FROM (
    SELECT url, dt, MIN(dt) OVER(PARTITION BY url ORDER BY dt) min_date
    FROM `After the join`
    )
    GROUP BY url, min_date


    with result as



    Row url                 min_date                    Interval_1  Interval_2  Interval_3   
    1 www.facebook.com 2014-01-05 00:00:00 UTC 2 0 1
    2 www.google.com 2018-01-01 00:00:00 UTC 2 2 1
    3 www.twitter.com 2016-02-01 00:00:00 UTC 1 1 0





    share|improve this answer
























      Your Answer






      StackExchange.ifUsing("editor", function () {
      StackExchange.using("externalEditor", function () {
      StackExchange.using("snippets", function () {
      StackExchange.snippets.init();
      });
      });
      }, "code-snippets");

      StackExchange.ready(function() {
      var channelOptions = {
      tags: "".split(" "),
      id: "1"
      };
      initTagRenderer("".split(" "), "".split(" "), channelOptions);

      StackExchange.using("externalEditor", function() {
      // Have to fire editor after snippets, if snippets enabled
      if (StackExchange.settings.snippets.snippetsEnabled) {
      StackExchange.using("snippets", function() {
      createEditor();
      });
      }
      else {
      createEditor();
      }
      });

      function createEditor() {
      StackExchange.prepareEditor({
      heartbeatType: 'answer',
      autoActivateHeartbeat: false,
      convertImagesToLinks: true,
      noModals: true,
      showLowRepImageUploadWarning: true,
      reputationToPostImages: 10,
      bindNavPrevention: true,
      postfix: "",
      imageUploader: {
      brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
      contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
      allowUrls: true
      },
      onDemand: true,
      discardSelector: ".discard-answer"
      ,immediatelyShowMarkdownHelp:true
      });


      }
      });














      draft saved

      draft discarded


















      StackExchange.ready(
      function () {
      StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f54016684%2fis-there-a-way-to-group-timestamp-data-by-30-day-intervals-starting-from-the-min%23new-answer', 'question_page');
      }
      );

      Post as a guest















      Required, but never shown

























      3 Answers
      3






      active

      oldest

      votes








      3 Answers
      3






      active

      oldest

      votes









      active

      oldest

      votes






      active

      oldest

      votes









      1














      If you are using BigQuery, I would recommend:





      • countif() to count a boolean value


      • timestamp_add() to add intervals to timestamps


      The exact boundaries are a bit vague, but I would go for:



      select pc.url,
      countif(pv.date >= pc.dt_crtd and
      pv.date < timestamp_add(pc.dt_crtd, interval 30 day
      ) as Interval_00_29,
      countif(pv.date >= timestamp_add(pc.dt_crtd, interval 30 day) and
      pv.date < timestamp_add(pc.dt_crtd, interval 60 day
      ) as Interval_30_59,
      countif(pv.date >= timestamp_add(pc.dt_crtd, interval 60 day) and
      pv.date < timestamp_add(pc.dt_crtd, interval 90 day
      ) as Interval_60_89
      from page_creation pc join
      page_visits pv
      on pc.link = pv.url
      group by pc.url





      share|improve this answer




























        1














        If you are using BigQuery, I would recommend:





        • countif() to count a boolean value


        • timestamp_add() to add intervals to timestamps


        The exact boundaries are a bit vague, but I would go for:



        select pc.url,
        countif(pv.date >= pc.dt_crtd and
        pv.date < timestamp_add(pc.dt_crtd, interval 30 day
        ) as Interval_00_29,
        countif(pv.date >= timestamp_add(pc.dt_crtd, interval 30 day) and
        pv.date < timestamp_add(pc.dt_crtd, interval 60 day
        ) as Interval_30_59,
        countif(pv.date >= timestamp_add(pc.dt_crtd, interval 60 day) and
        pv.date < timestamp_add(pc.dt_crtd, interval 90 day
        ) as Interval_60_89
        from page_creation pc join
        page_visits pv
        on pc.link = pv.url
        group by pc.url





        share|improve this answer


























          1












          1








          1







          If you are using BigQuery, I would recommend:





          • countif() to count a boolean value


          • timestamp_add() to add intervals to timestamps


          The exact boundaries are a bit vague, but I would go for:



          select pc.url,
          countif(pv.date >= pc.dt_crtd and
          pv.date < timestamp_add(pc.dt_crtd, interval 30 day
          ) as Interval_00_29,
          countif(pv.date >= timestamp_add(pc.dt_crtd, interval 30 day) and
          pv.date < timestamp_add(pc.dt_crtd, interval 60 day
          ) as Interval_30_59,
          countif(pv.date >= timestamp_add(pc.dt_crtd, interval 60 day) and
          pv.date < timestamp_add(pc.dt_crtd, interval 90 day
          ) as Interval_60_89
          from page_creation pc join
          page_visits pv
          on pc.link = pv.url
          group by pc.url





          share|improve this answer













          If you are using BigQuery, I would recommend:





          • countif() to count a boolean value


          • timestamp_add() to add intervals to timestamps


          The exact boundaries are a bit vague, but I would go for:



          select pc.url,
          countif(pv.date >= pc.dt_crtd and
          pv.date < timestamp_add(pc.dt_crtd, interval 30 day
          ) as Interval_00_29,
          countif(pv.date >= timestamp_add(pc.dt_crtd, interval 30 day) and
          pv.date < timestamp_add(pc.dt_crtd, interval 60 day
          ) as Interval_30_59,
          countif(pv.date >= timestamp_add(pc.dt_crtd, interval 60 day) and
          pv.date < timestamp_add(pc.dt_crtd, interval 90 day
          ) as Interval_60_89
          from page_creation pc join
          page_visits pv
          on pc.link = pv.url
          group by pc.url






          share|improve this answer












          share|improve this answer



          share|improve this answer










          answered Jan 3 at 12:30









          Gordon LinoffGordon Linoff

          794k37318421




          794k37318421

























              2














              If I understood your question clearly, you want to calculate page visits between 30 , 60 , 90 days intervals after page creation. If it's the requirement, try below SQL code :-



              select a11.url
              ,Sum(case when a12.date between a11.dt_crtd and a11.dt_crtd+30 then 1 else 0) Interval_1
              ,Sum(case when a12.date between a11.dt_crtd+31 and a11.dt_crtd+60 then 1 else 0) Interval_2
              ,Sum(case when a12.date between a11.dt_crtd+61 and a11.dt_crtd+90 then 1 else 0) Interval_3
              from page_creation a11
              join page_visits a12
              on a11.link = a12.url
              group by a11.url





              share|improve this answer
























              • i will try that when i go into the office later. Does it matter that dt_crtd is a timestamp? In other words will adding +30 work without needing to convert the timestamp into Char()? Also to answer your question, yes you understood my question clearly! Thank you so much.

                – CCP
                Jan 3 at 12:28
















              2














              If I understood your question clearly, you want to calculate page visits between 30 , 60 , 90 days intervals after page creation. If it's the requirement, try below SQL code :-



              select a11.url
              ,Sum(case when a12.date between a11.dt_crtd and a11.dt_crtd+30 then 1 else 0) Interval_1
              ,Sum(case when a12.date between a11.dt_crtd+31 and a11.dt_crtd+60 then 1 else 0) Interval_2
              ,Sum(case when a12.date between a11.dt_crtd+61 and a11.dt_crtd+90 then 1 else 0) Interval_3
              from page_creation a11
              join page_visits a12
              on a11.link = a12.url
              group by a11.url





              share|improve this answer
























              • i will try that when i go into the office later. Does it matter that dt_crtd is a timestamp? In other words will adding +30 work without needing to convert the timestamp into Char()? Also to answer your question, yes you understood my question clearly! Thank you so much.

                – CCP
                Jan 3 at 12:28














              2












              2








              2







              If I understood your question clearly, you want to calculate page visits between 30 , 60 , 90 days intervals after page creation. If it's the requirement, try below SQL code :-



              select a11.url
              ,Sum(case when a12.date between a11.dt_crtd and a11.dt_crtd+30 then 1 else 0) Interval_1
              ,Sum(case when a12.date between a11.dt_crtd+31 and a11.dt_crtd+60 then 1 else 0) Interval_2
              ,Sum(case when a12.date between a11.dt_crtd+61 and a11.dt_crtd+90 then 1 else 0) Interval_3
              from page_creation a11
              join page_visits a12
              on a11.link = a12.url
              group by a11.url





              share|improve this answer













              If I understood your question clearly, you want to calculate page visits between 30 , 60 , 90 days intervals after page creation. If it's the requirement, try below SQL code :-



              select a11.url
              ,Sum(case when a12.date between a11.dt_crtd and a11.dt_crtd+30 then 1 else 0) Interval_1
              ,Sum(case when a12.date between a11.dt_crtd+31 and a11.dt_crtd+60 then 1 else 0) Interval_2
              ,Sum(case when a12.date between a11.dt_crtd+61 and a11.dt_crtd+90 then 1 else 0) Interval_3
              from page_creation a11
              join page_visits a12
              on a11.link = a12.url
              group by a11.url






              share|improve this answer












              share|improve this answer



              share|improve this answer










              answered Jan 3 at 7:54









              GauravSGauravS

              865510




              865510













              • i will try that when i go into the office later. Does it matter that dt_crtd is a timestamp? In other words will adding +30 work without needing to convert the timestamp into Char()? Also to answer your question, yes you understood my question clearly! Thank you so much.

                – CCP
                Jan 3 at 12:28



















              • i will try that when i go into the office later. Does it matter that dt_crtd is a timestamp? In other words will adding +30 work without needing to convert the timestamp into Char()? Also to answer your question, yes you understood my question clearly! Thank you so much.

                – CCP
                Jan 3 at 12:28

















              i will try that when i go into the office later. Does it matter that dt_crtd is a timestamp? In other words will adding +30 work without needing to convert the timestamp into Char()? Also to answer your question, yes you understood my question clearly! Thank you so much.

              – CCP
              Jan 3 at 12:28





              i will try that when i go into the office later. Does it matter that dt_crtd is a timestamp? In other words will adding +30 work without needing to convert the timestamp into Char()? Also to answer your question, yes you understood my question clearly! Thank you so much.

              – CCP
              Jan 3 at 12:28











              1














              The way I am reading your scenario and especially based on example of After the join i get a table similar to ... is that you have two tables that you need to UNION - not to JOIN



              So, based on that reading below example is for BigQuery Standard SQL (project.dataset.page_creation and project.dataset.page_visits are here just to mimic your Table 1 and Table2)



              #standardSQL
              WITH `project.dataset.page_creation` AS (
              SELECT 'www.google.com' link, TIMESTAMP '2018-01-01 00:00:00' dt_crtd UNION ALL
              SELECT 'www.facebook.com', '2014-01-05 00:00:00' UNION ALL
              SELECT 'www.twitter.com', '2016-02-01 00:00:00'
              ), `project.dataset.page_visits` AS (
              SELECT 'www.google.com' url, TIMESTAMP '2018-01-02 00:00:00' dt UNION ALL
              SELECT 'www.google.com', '2018-02-01 00:00:00' UNION ALL
              SELECT 'www.google.com', '2018-02-05 00:00:00' UNION ALL
              SELECT 'www.google.com', '2018-03-04 00:00:00' UNION ALL
              SELECT 'www.facebook.com', '2014-01-07 00:00:00' UNION ALL
              SELECT 'www.facebook.com', '2014-04-02 00:00:00' UNION ALL
              SELECT 'www.facebook.com', '2014-04-10 00:00:00' UNION ALL
              SELECT 'www.facebook.com', '2014-04-11 00:00:00' UNION ALL
              SELECT 'www.facebook.com', '2014-05-01 00:00:00' UNION ALL
              SELECT 'www.twitter.com', '2016-03-04 00:00:00'
              ), `After the join` AS (
              SELECT url, dt FROM `project.dataset.page_visits` UNION DISTINCT
              SELECT link, dt_crtd FROM `project.dataset.page_creation`
              )
              SELECT
              url, min_date,
              COUNTIF(dt BETWEEN min_date AND TIMESTAMP_ADD(min_date, INTERVAL 29 DAY)) Interval_1,
              COUNTIF(dt BETWEEN TIMESTAMP_ADD(min_date, INTERVAL 30 DAY) AND TIMESTAMP_ADD(min_date, INTERVAL 59 DAY)) Interval_2,
              COUNTIF(dt BETWEEN TIMESTAMP_ADD(min_date, INTERVAL 60 DAY) AND TIMESTAMP_ADD(min_date, INTERVAL 89 DAY)) Interval_3
              FROM (
              SELECT url, dt, MIN(dt) OVER(PARTITION BY url ORDER BY dt) min_date
              FROM `After the join`
              )
              GROUP BY url, min_date


              with result as



              Row url                 min_date                    Interval_1  Interval_2  Interval_3   
              1 www.facebook.com 2014-01-05 00:00:00 UTC 2 0 1
              2 www.google.com 2018-01-01 00:00:00 UTC 2 2 1
              3 www.twitter.com 2016-02-01 00:00:00 UTC 1 1 0





              share|improve this answer




























                1














                The way I am reading your scenario and especially based on example of After the join i get a table similar to ... is that you have two tables that you need to UNION - not to JOIN



                So, based on that reading below example is for BigQuery Standard SQL (project.dataset.page_creation and project.dataset.page_visits are here just to mimic your Table 1 and Table2)



                #standardSQL
                WITH `project.dataset.page_creation` AS (
                SELECT 'www.google.com' link, TIMESTAMP '2018-01-01 00:00:00' dt_crtd UNION ALL
                SELECT 'www.facebook.com', '2014-01-05 00:00:00' UNION ALL
                SELECT 'www.twitter.com', '2016-02-01 00:00:00'
                ), `project.dataset.page_visits` AS (
                SELECT 'www.google.com' url, TIMESTAMP '2018-01-02 00:00:00' dt UNION ALL
                SELECT 'www.google.com', '2018-02-01 00:00:00' UNION ALL
                SELECT 'www.google.com', '2018-02-05 00:00:00' UNION ALL
                SELECT 'www.google.com', '2018-03-04 00:00:00' UNION ALL
                SELECT 'www.facebook.com', '2014-01-07 00:00:00' UNION ALL
                SELECT 'www.facebook.com', '2014-04-02 00:00:00' UNION ALL
                SELECT 'www.facebook.com', '2014-04-10 00:00:00' UNION ALL
                SELECT 'www.facebook.com', '2014-04-11 00:00:00' UNION ALL
                SELECT 'www.facebook.com', '2014-05-01 00:00:00' UNION ALL
                SELECT 'www.twitter.com', '2016-03-04 00:00:00'
                ), `After the join` AS (
                SELECT url, dt FROM `project.dataset.page_visits` UNION DISTINCT
                SELECT link, dt_crtd FROM `project.dataset.page_creation`
                )
                SELECT
                url, min_date,
                COUNTIF(dt BETWEEN min_date AND TIMESTAMP_ADD(min_date, INTERVAL 29 DAY)) Interval_1,
                COUNTIF(dt BETWEEN TIMESTAMP_ADD(min_date, INTERVAL 30 DAY) AND TIMESTAMP_ADD(min_date, INTERVAL 59 DAY)) Interval_2,
                COUNTIF(dt BETWEEN TIMESTAMP_ADD(min_date, INTERVAL 60 DAY) AND TIMESTAMP_ADD(min_date, INTERVAL 89 DAY)) Interval_3
                FROM (
                SELECT url, dt, MIN(dt) OVER(PARTITION BY url ORDER BY dt) min_date
                FROM `After the join`
                )
                GROUP BY url, min_date


                with result as



                Row url                 min_date                    Interval_1  Interval_2  Interval_3   
                1 www.facebook.com 2014-01-05 00:00:00 UTC 2 0 1
                2 www.google.com 2018-01-01 00:00:00 UTC 2 2 1
                3 www.twitter.com 2016-02-01 00:00:00 UTC 1 1 0





                share|improve this answer


























                  1












                  1








                  1







                  The way I am reading your scenario and especially based on example of After the join i get a table similar to ... is that you have two tables that you need to UNION - not to JOIN



                  So, based on that reading below example is for BigQuery Standard SQL (project.dataset.page_creation and project.dataset.page_visits are here just to mimic your Table 1 and Table2)



                  #standardSQL
                  WITH `project.dataset.page_creation` AS (
                  SELECT 'www.google.com' link, TIMESTAMP '2018-01-01 00:00:00' dt_crtd UNION ALL
                  SELECT 'www.facebook.com', '2014-01-05 00:00:00' UNION ALL
                  SELECT 'www.twitter.com', '2016-02-01 00:00:00'
                  ), `project.dataset.page_visits` AS (
                  SELECT 'www.google.com' url, TIMESTAMP '2018-01-02 00:00:00' dt UNION ALL
                  SELECT 'www.google.com', '2018-02-01 00:00:00' UNION ALL
                  SELECT 'www.google.com', '2018-02-05 00:00:00' UNION ALL
                  SELECT 'www.google.com', '2018-03-04 00:00:00' UNION ALL
                  SELECT 'www.facebook.com', '2014-01-07 00:00:00' UNION ALL
                  SELECT 'www.facebook.com', '2014-04-02 00:00:00' UNION ALL
                  SELECT 'www.facebook.com', '2014-04-10 00:00:00' UNION ALL
                  SELECT 'www.facebook.com', '2014-04-11 00:00:00' UNION ALL
                  SELECT 'www.facebook.com', '2014-05-01 00:00:00' UNION ALL
                  SELECT 'www.twitter.com', '2016-03-04 00:00:00'
                  ), `After the join` AS (
                  SELECT url, dt FROM `project.dataset.page_visits` UNION DISTINCT
                  SELECT link, dt_crtd FROM `project.dataset.page_creation`
                  )
                  SELECT
                  url, min_date,
                  COUNTIF(dt BETWEEN min_date AND TIMESTAMP_ADD(min_date, INTERVAL 29 DAY)) Interval_1,
                  COUNTIF(dt BETWEEN TIMESTAMP_ADD(min_date, INTERVAL 30 DAY) AND TIMESTAMP_ADD(min_date, INTERVAL 59 DAY)) Interval_2,
                  COUNTIF(dt BETWEEN TIMESTAMP_ADD(min_date, INTERVAL 60 DAY) AND TIMESTAMP_ADD(min_date, INTERVAL 89 DAY)) Interval_3
                  FROM (
                  SELECT url, dt, MIN(dt) OVER(PARTITION BY url ORDER BY dt) min_date
                  FROM `After the join`
                  )
                  GROUP BY url, min_date


                  with result as



                  Row url                 min_date                    Interval_1  Interval_2  Interval_3   
                  1 www.facebook.com 2014-01-05 00:00:00 UTC 2 0 1
                  2 www.google.com 2018-01-01 00:00:00 UTC 2 2 1
                  3 www.twitter.com 2016-02-01 00:00:00 UTC 1 1 0





                  share|improve this answer













                  The way I am reading your scenario and especially based on example of After the join i get a table similar to ... is that you have two tables that you need to UNION - not to JOIN



                  So, based on that reading below example is for BigQuery Standard SQL (project.dataset.page_creation and project.dataset.page_visits are here just to mimic your Table 1 and Table2)



                  #standardSQL
                  WITH `project.dataset.page_creation` AS (
                  SELECT 'www.google.com' link, TIMESTAMP '2018-01-01 00:00:00' dt_crtd UNION ALL
                  SELECT 'www.facebook.com', '2014-01-05 00:00:00' UNION ALL
                  SELECT 'www.twitter.com', '2016-02-01 00:00:00'
                  ), `project.dataset.page_visits` AS (
                  SELECT 'www.google.com' url, TIMESTAMP '2018-01-02 00:00:00' dt UNION ALL
                  SELECT 'www.google.com', '2018-02-01 00:00:00' UNION ALL
                  SELECT 'www.google.com', '2018-02-05 00:00:00' UNION ALL
                  SELECT 'www.google.com', '2018-03-04 00:00:00' UNION ALL
                  SELECT 'www.facebook.com', '2014-01-07 00:00:00' UNION ALL
                  SELECT 'www.facebook.com', '2014-04-02 00:00:00' UNION ALL
                  SELECT 'www.facebook.com', '2014-04-10 00:00:00' UNION ALL
                  SELECT 'www.facebook.com', '2014-04-11 00:00:00' UNION ALL
                  SELECT 'www.facebook.com', '2014-05-01 00:00:00' UNION ALL
                  SELECT 'www.twitter.com', '2016-03-04 00:00:00'
                  ), `After the join` AS (
                  SELECT url, dt FROM `project.dataset.page_visits` UNION DISTINCT
                  SELECT link, dt_crtd FROM `project.dataset.page_creation`
                  )
                  SELECT
                  url, min_date,
                  COUNTIF(dt BETWEEN min_date AND TIMESTAMP_ADD(min_date, INTERVAL 29 DAY)) Interval_1,
                  COUNTIF(dt BETWEEN TIMESTAMP_ADD(min_date, INTERVAL 30 DAY) AND TIMESTAMP_ADD(min_date, INTERVAL 59 DAY)) Interval_2,
                  COUNTIF(dt BETWEEN TIMESTAMP_ADD(min_date, INTERVAL 60 DAY) AND TIMESTAMP_ADD(min_date, INTERVAL 89 DAY)) Interval_3
                  FROM (
                  SELECT url, dt, MIN(dt) OVER(PARTITION BY url ORDER BY dt) min_date
                  FROM `After the join`
                  )
                  GROUP BY url, min_date


                  with result as



                  Row url                 min_date                    Interval_1  Interval_2  Interval_3   
                  1 www.facebook.com 2014-01-05 00:00:00 UTC 2 0 1
                  2 www.google.com 2018-01-01 00:00:00 UTC 2 2 1
                  3 www.twitter.com 2016-02-01 00:00:00 UTC 1 1 0






                  share|improve this answer












                  share|improve this answer



                  share|improve this answer










                  answered Jan 3 at 22:18









                  Mikhail BerlyantMikhail Berlyant

                  62.9k43874




                  62.9k43874






























                      draft saved

                      draft discarded




















































                      Thanks for contributing an answer to Stack Overflow!


                      • Please be sure to answer the question. Provide details and share your research!

                      But avoid



                      • Asking for help, clarification, or responding to other answers.

                      • Making statements based on opinion; back them up with references or personal experience.


                      To learn more, see our tips on writing great answers.




                      draft saved


                      draft discarded














                      StackExchange.ready(
                      function () {
                      StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f54016684%2fis-there-a-way-to-group-timestamp-data-by-30-day-intervals-starting-from-the-min%23new-answer', 'question_page');
                      }
                      );

                      Post as a guest















                      Required, but never shown





















































                      Required, but never shown














                      Required, but never shown












                      Required, but never shown







                      Required, but never shown

































                      Required, but never shown














                      Required, but never shown












                      Required, but never shown







                      Required, but never shown







                      Popular posts from this blog

                      MongoDB - Not Authorized To Execute Command

                      in spring boot 2.1 many test slices are not allowed anymore due to multiple @BootstrapWith

                      How to fix TextFormField cause rebuild widget in Flutter