dynamically bind variable/parameter in Spark SQL?












11















How to bind variable in Apache Spark SQL? For example:



val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
sqlContext.sql("SELECT * FROM src WHERE col1 = ${VAL1}").collect().foreach(println)









share|improve this question





























    11















    How to bind variable in Apache Spark SQL? For example:



    val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
    sqlContext.sql("SELECT * FROM src WHERE col1 = ${VAL1}").collect().foreach(println)









    share|improve this question



























      11












      11








      11


      2






      How to bind variable in Apache Spark SQL? For example:



      val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
      sqlContext.sql("SELECT * FROM src WHERE col1 = ${VAL1}").collect().foreach(println)









      share|improve this question
















      How to bind variable in Apache Spark SQL? For example:



      val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
      sqlContext.sql("SELECT * FROM src WHERE col1 = ${VAL1}").collect().foreach(println)






      scala apache-spark apache-spark-sql apache-spark-2.0






      share|improve this question















      share|improve this question













      share|improve this question




      share|improve this question








      edited Aug 15 '17 at 17:08









      mrsrinivas

      15.3k76888




      15.3k76888










      asked Nov 5 '14 at 10:44









      user3769729user3769729

      56113




      56113
























          3 Answers
          3






          active

          oldest

          votes


















          7














          Spark SQL (as of 1.6 release) does not support bind variables.



          ps. What Ashrith is suggesting is not a bind variable.. You're constructing a string every time. Evey time Spark will parse the query, create execution plan etc. Purpose of bind variables (in RDBMS systems for example) is to cut time on creating execution plan (which can be costly where there are a lot of joins etc). Spark has to have a special API to "parse" a query and then to "bind" variables. Spark does not have this functionality (as of today, Spark 1.6 release).



          Update 8/2018: as of Spark 2.3 there are (still) no bind variables in Spark.






          share|improve this answer


























          • hmm, thats sad. Thanks for updating.

            – Nathan Loyer
            Oct 19 '18 at 0:51



















          6














          I verified it in both Spark shell 2.x shell and Thrift(beeline) as well. I could able to bind a variable in Spark SQL query with set command.



          Query without bind variable:



          select count(1) from mytable;


          Query with bind variable (parameterized):




          1. Spark SQL shell



           set key_tbl=mytable; -- setting mytable to key_tbl to use as ${key_tbl}
          select count(1) from ${key_tbl};


          2. Spark shell



          spark.sql("set key_tbl=mytable")
          spark.sql("select count(1) from ${key_tbl}").collect()



          Both w/w.o bind params the query returns an identical result.




          Note: Don't give any quotes to the value of key as it's table name here.




          Let me know if there are any questions.






          share|improve this answer


























          • with val slqDF = spark.sql("select count(*) from Prd_IN_GeneralEvents where pdid like $'{id}'") Same Error

            – earl
            Jul 13 '17 at 15:46











          • @jam: Try print the SELECT statement rather than running it. you will know, how the query is forming.

            – mrsrinivas
            Jul 13 '17 at 15:50











          • Any reason for -1 ?

            – mrsrinivas
            Aug 13 '18 at 19:29



















          0














          Pyspark



          sqlContext.sql("SELECT * FROM src WHERE col1 = {1} and col2 = {2}".format(VAL1,VAL2).collect().foreach(println)





          share|improve this answer























            Your Answer






            StackExchange.ifUsing("editor", function () {
            StackExchange.using("externalEditor", function () {
            StackExchange.using("snippets", function () {
            StackExchange.snippets.init();
            });
            });
            }, "code-snippets");

            StackExchange.ready(function() {
            var channelOptions = {
            tags: "".split(" "),
            id: "1"
            };
            initTagRenderer("".split(" "), "".split(" "), channelOptions);

            StackExchange.using("externalEditor", function() {
            // Have to fire editor after snippets, if snippets enabled
            if (StackExchange.settings.snippets.snippetsEnabled) {
            StackExchange.using("snippets", function() {
            createEditor();
            });
            }
            else {
            createEditor();
            }
            });

            function createEditor() {
            StackExchange.prepareEditor({
            heartbeatType: 'answer',
            autoActivateHeartbeat: false,
            convertImagesToLinks: true,
            noModals: true,
            showLowRepImageUploadWarning: true,
            reputationToPostImages: 10,
            bindNavPrevention: true,
            postfix: "",
            imageUploader: {
            brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
            contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
            allowUrls: true
            },
            onDemand: true,
            discardSelector: ".discard-answer"
            ,immediatelyShowMarkdownHelp:true
            });


            }
            });














            draft saved

            draft discarded


















            StackExchange.ready(
            function () {
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f26755230%2fdynamically-bind-variable-parameter-in-spark-sql%23new-answer', 'question_page');
            }
            );

            Post as a guest















            Required, but never shown

























            3 Answers
            3






            active

            oldest

            votes








            3 Answers
            3






            active

            oldest

            votes









            active

            oldest

            votes






            active

            oldest

            votes









            7














            Spark SQL (as of 1.6 release) does not support bind variables.



            ps. What Ashrith is suggesting is not a bind variable.. You're constructing a string every time. Evey time Spark will parse the query, create execution plan etc. Purpose of bind variables (in RDBMS systems for example) is to cut time on creating execution plan (which can be costly where there are a lot of joins etc). Spark has to have a special API to "parse" a query and then to "bind" variables. Spark does not have this functionality (as of today, Spark 1.6 release).



            Update 8/2018: as of Spark 2.3 there are (still) no bind variables in Spark.






            share|improve this answer


























            • hmm, thats sad. Thanks for updating.

              – Nathan Loyer
              Oct 19 '18 at 0:51
















            7














            Spark SQL (as of 1.6 release) does not support bind variables.



            ps. What Ashrith is suggesting is not a bind variable.. You're constructing a string every time. Evey time Spark will parse the query, create execution plan etc. Purpose of bind variables (in RDBMS systems for example) is to cut time on creating execution plan (which can be costly where there are a lot of joins etc). Spark has to have a special API to "parse" a query and then to "bind" variables. Spark does not have this functionality (as of today, Spark 1.6 release).



            Update 8/2018: as of Spark 2.3 there are (still) no bind variables in Spark.






            share|improve this answer


























            • hmm, thats sad. Thanks for updating.

              – Nathan Loyer
              Oct 19 '18 at 0:51














            7












            7








            7







            Spark SQL (as of 1.6 release) does not support bind variables.



            ps. What Ashrith is suggesting is not a bind variable.. You're constructing a string every time. Evey time Spark will parse the query, create execution plan etc. Purpose of bind variables (in RDBMS systems for example) is to cut time on creating execution plan (which can be costly where there are a lot of joins etc). Spark has to have a special API to "parse" a query and then to "bind" variables. Spark does not have this functionality (as of today, Spark 1.6 release).



            Update 8/2018: as of Spark 2.3 there are (still) no bind variables in Spark.






            share|improve this answer















            Spark SQL (as of 1.6 release) does not support bind variables.



            ps. What Ashrith is suggesting is not a bind variable.. You're constructing a string every time. Evey time Spark will parse the query, create execution plan etc. Purpose of bind variables (in RDBMS systems for example) is to cut time on creating execution plan (which can be costly where there are a lot of joins etc). Spark has to have a special API to "parse" a query and then to "bind" variables. Spark does not have this functionality (as of today, Spark 1.6 release).



            Update 8/2018: as of Spark 2.3 there are (still) no bind variables in Spark.







            share|improve this answer














            share|improve this answer



            share|improve this answer








            edited Dec 26 '18 at 20:49

























            answered Sep 11 '15 at 0:39









            TagarTagar

            4,79434166




            4,79434166













            • hmm, thats sad. Thanks for updating.

              – Nathan Loyer
              Oct 19 '18 at 0:51



















            • hmm, thats sad. Thanks for updating.

              – Nathan Loyer
              Oct 19 '18 at 0:51

















            hmm, thats sad. Thanks for updating.

            – Nathan Loyer
            Oct 19 '18 at 0:51





            hmm, thats sad. Thanks for updating.

            – Nathan Loyer
            Oct 19 '18 at 0:51













            6














            I verified it in both Spark shell 2.x shell and Thrift(beeline) as well. I could able to bind a variable in Spark SQL query with set command.



            Query without bind variable:



            select count(1) from mytable;


            Query with bind variable (parameterized):




            1. Spark SQL shell



             set key_tbl=mytable; -- setting mytable to key_tbl to use as ${key_tbl}
            select count(1) from ${key_tbl};


            2. Spark shell



            spark.sql("set key_tbl=mytable")
            spark.sql("select count(1) from ${key_tbl}").collect()



            Both w/w.o bind params the query returns an identical result.




            Note: Don't give any quotes to the value of key as it's table name here.




            Let me know if there are any questions.






            share|improve this answer


























            • with val slqDF = spark.sql("select count(*) from Prd_IN_GeneralEvents where pdid like $'{id}'") Same Error

              – earl
              Jul 13 '17 at 15:46











            • @jam: Try print the SELECT statement rather than running it. you will know, how the query is forming.

              – mrsrinivas
              Jul 13 '17 at 15:50











            • Any reason for -1 ?

              – mrsrinivas
              Aug 13 '18 at 19:29
















            6














            I verified it in both Spark shell 2.x shell and Thrift(beeline) as well. I could able to bind a variable in Spark SQL query with set command.



            Query without bind variable:



            select count(1) from mytable;


            Query with bind variable (parameterized):




            1. Spark SQL shell



             set key_tbl=mytable; -- setting mytable to key_tbl to use as ${key_tbl}
            select count(1) from ${key_tbl};


            2. Spark shell



            spark.sql("set key_tbl=mytable")
            spark.sql("select count(1) from ${key_tbl}").collect()



            Both w/w.o bind params the query returns an identical result.




            Note: Don't give any quotes to the value of key as it's table name here.




            Let me know if there are any questions.






            share|improve this answer


























            • with val slqDF = spark.sql("select count(*) from Prd_IN_GeneralEvents where pdid like $'{id}'") Same Error

              – earl
              Jul 13 '17 at 15:46











            • @jam: Try print the SELECT statement rather than running it. you will know, how the query is forming.

              – mrsrinivas
              Jul 13 '17 at 15:50











            • Any reason for -1 ?

              – mrsrinivas
              Aug 13 '18 at 19:29














            6












            6








            6







            I verified it in both Spark shell 2.x shell and Thrift(beeline) as well. I could able to bind a variable in Spark SQL query with set command.



            Query without bind variable:



            select count(1) from mytable;


            Query with bind variable (parameterized):




            1. Spark SQL shell



             set key_tbl=mytable; -- setting mytable to key_tbl to use as ${key_tbl}
            select count(1) from ${key_tbl};


            2. Spark shell



            spark.sql("set key_tbl=mytable")
            spark.sql("select count(1) from ${key_tbl}").collect()



            Both w/w.o bind params the query returns an identical result.




            Note: Don't give any quotes to the value of key as it's table name here.




            Let me know if there are any questions.






            share|improve this answer















            I verified it in both Spark shell 2.x shell and Thrift(beeline) as well. I could able to bind a variable in Spark SQL query with set command.



            Query without bind variable:



            select count(1) from mytable;


            Query with bind variable (parameterized):




            1. Spark SQL shell



             set key_tbl=mytable; -- setting mytable to key_tbl to use as ${key_tbl}
            select count(1) from ${key_tbl};


            2. Spark shell



            spark.sql("set key_tbl=mytable")
            spark.sql("select count(1) from ${key_tbl}").collect()



            Both w/w.o bind params the query returns an identical result.




            Note: Don't give any quotes to the value of key as it's table name here.




            Let me know if there are any questions.







            share|improve this answer














            share|improve this answer



            share|improve this answer








            edited Nov 20 '18 at 8:36

























            answered Nov 22 '16 at 6:18









            mrsrinivasmrsrinivas

            15.3k76888




            15.3k76888













            • with val slqDF = spark.sql("select count(*) from Prd_IN_GeneralEvents where pdid like $'{id}'") Same Error

              – earl
              Jul 13 '17 at 15:46











            • @jam: Try print the SELECT statement rather than running it. you will know, how the query is forming.

              – mrsrinivas
              Jul 13 '17 at 15:50











            • Any reason for -1 ?

              – mrsrinivas
              Aug 13 '18 at 19:29



















            • with val slqDF = spark.sql("select count(*) from Prd_IN_GeneralEvents where pdid like $'{id}'") Same Error

              – earl
              Jul 13 '17 at 15:46











            • @jam: Try print the SELECT statement rather than running it. you will know, how the query is forming.

              – mrsrinivas
              Jul 13 '17 at 15:50











            • Any reason for -1 ?

              – mrsrinivas
              Aug 13 '18 at 19:29

















            with val slqDF = spark.sql("select count(*) from Prd_IN_GeneralEvents where pdid like $'{id}'") Same Error

            – earl
            Jul 13 '17 at 15:46





            with val slqDF = spark.sql("select count(*) from Prd_IN_GeneralEvents where pdid like $'{id}'") Same Error

            – earl
            Jul 13 '17 at 15:46













            @jam: Try print the SELECT statement rather than running it. you will know, how the query is forming.

            – mrsrinivas
            Jul 13 '17 at 15:50





            @jam: Try print the SELECT statement rather than running it. you will know, how the query is forming.

            – mrsrinivas
            Jul 13 '17 at 15:50













            Any reason for -1 ?

            – mrsrinivas
            Aug 13 '18 at 19:29





            Any reason for -1 ?

            – mrsrinivas
            Aug 13 '18 at 19:29











            0














            Pyspark



            sqlContext.sql("SELECT * FROM src WHERE col1 = {1} and col2 = {2}".format(VAL1,VAL2).collect().foreach(println)





            share|improve this answer




























              0














              Pyspark



              sqlContext.sql("SELECT * FROM src WHERE col1 = {1} and col2 = {2}".format(VAL1,VAL2).collect().foreach(println)





              share|improve this answer


























                0












                0








                0







                Pyspark



                sqlContext.sql("SELECT * FROM src WHERE col1 = {1} and col2 = {2}".format(VAL1,VAL2).collect().foreach(println)





                share|improve this answer













                Pyspark



                sqlContext.sql("SELECT * FROM src WHERE col1 = {1} and col2 = {2}".format(VAL1,VAL2).collect().foreach(println)






                share|improve this answer












                share|improve this answer



                share|improve this answer










                answered Oct 17 '17 at 7:19









                Vijay KrishnaVijay Krishna

                491315




                491315






























                    draft saved

                    draft discarded




















































                    Thanks for contributing an answer to Stack Overflow!


                    • Please be sure to answer the question. Provide details and share your research!

                    But avoid



                    • Asking for help, clarification, or responding to other answers.

                    • Making statements based on opinion; back them up with references or personal experience.


                    To learn more, see our tips on writing great answers.




                    draft saved


                    draft discarded














                    StackExchange.ready(
                    function () {
                    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f26755230%2fdynamically-bind-variable-parameter-in-spark-sql%23new-answer', 'question_page');
                    }
                    );

                    Post as a guest















                    Required, but never shown





















































                    Required, but never shown














                    Required, but never shown












                    Required, but never shown







                    Required, but never shown

































                    Required, but never shown














                    Required, but never shown












                    Required, but never shown







                    Required, but never shown







                    Popular posts from this blog

                    Can a sorcerer learn a 5th-level spell early by creating spell slots using the Font of Magic feature?

                    Does disintegrating a polymorphed enemy still kill it after the 2018 errata?

                    A Topological Invariant for $pi_3(U(n))$