How to uncache RDD?












I used cache() to cache the data in memory, but to measure the performance without cached data I need to uncache it and remove the data from memory:

rdd.cache()
// do some computation
...
rdd.uncache()

but I got this error:

value uncache is not a member of org.apache.spark.rdd.RDD[(Int, Array[Float])]

I don't know how to uncache it, then!










      scala apache-spark






asked Sep 19 '14 at 16:35 by Rubbic; edited Sep 20 '14 at 23:41 by Jacek Laskowski
























          4 Answers
An RDD can be uncached using unpersist():

rdd.unpersist()
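For context, here is a minimal sketch of the whole cache-then-uncache cycle from the question. It assumes a local Spark master and a made-up dataset with the same element type as in the error message; the object and method names (`UncacheExample`, `cacheThenUnpersist`) are hypothetical, not part of Spark.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

object UncacheExample {
  /** Caches an RDD, uses it, then unpersists it.
    * Returns the storage level observed while cached and after unpersist(). */
  def cacheThenUnpersist(sc: SparkContext): (StorageLevel, StorageLevel) = {
    // Hypothetical data of type RDD[(Int, Array[Float])], as in the question.
    val rdd = sc.parallelize(1 to 1000).map(i => (i, Array.fill(4)(i.toFloat)))

    rdd.cache()      // for RDDs this is persist(StorageLevel.MEMORY_ONLY)
    rdd.count()      // an action is needed to actually fill the cache
    val whileCached = rdd.getStorageLevel

    rdd.unpersist()  // marks the RDD as non-persistent and drops its blocks
    rdd.count()      // this run is recomputed from the lineage, not the cache
    (whileCached, rdd.getStorageLevel)
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local master is good enough for a quick experiment.
    val conf = new SparkConf().setMaster("local[2]").setAppName("uncache-example")
    val sc = new SparkContext(conf)
    try {
      val (before, after) = cacheThenUnpersist(sc)
      println(s"while cached: $before, after unpersist: $after")
    } finally sc.stop()
  }
}
```

To compare performance, time the action once before `unpersist()` and once after it; the RDD itself stays usable after it has been unpersisted.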







answered Sep 19 '14 at 16:48 by Josh Rosen; edited May 21 '18 at 3:53 by mrsrinivas













              • Thank you so much @Josh

                – Rubbic
                Sep 19 '14 at 16:55



















The uncache function doesn't exist. I think you are looking for unpersist, which according to the Spark ScalaDoc marks the RDD as non-persistent and removes all blocks for it from memory and disk.







answered Sep 19 '14 at 16:49 by eliasah













              • Thanks you are right. I tried what Josh said and it seems like it's working!

                – Rubbic
                Sep 19 '14 at 16:54











              • It's ok. It's exactly the same answer. ;)

                – eliasah
                Sep 19 '14 at 16:55






• It'd be quite useful to merge the answers and remove one. What do you think?

                – Jacek Laskowski
                Sep 20 '14 at 23:43



















If you cache the source data in an RDD using .cache(), and either you have configured only a small amount of memory or the default is used (about 500 MB for me), and you run the code again and again, then this error occurs.

Try clearing all RDDs at the end of the code, so that each time the code runs the RDD is created and then cleared from memory.

Do this by using: RDD_Name.unpersist()
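One way to make sure the cleanup always happens at the end of the code is to wrap the work in try/finally. The sketch below is only an illustration: the helper `withCached` and the object name `CleanupExample` are hypothetical, and a local master is assumed.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object CleanupExample {
  /** Hypothetical helper: caches `rdd`, runs `body` on it, and always
    * unpersists it afterwards, even if `body` throws. */
  def withCached[T, R](rdd: RDD[T])(body: RDD[T] => R): R = {
    rdd.cache()
    try body(rdd)
    finally rdd.unpersist()
  }

  def main(args: Array[String]): Unit = {
    // Assumption: local master, small made-up dataset.
    val conf = new SparkConf().setMaster("local[2]").setAppName("cleanup-example")
    val sc = new SparkContext(conf)
    try {
      val numbers = sc.parallelize(1 to 100)
      val total = withCached(numbers) { cached =>
        cached.count()         // first action fills the cache
        cached.reduce(_ + _)   // second action can reuse the cached blocks
      }
      println(s"total = $total, RDDs still persisted: ${sc.getPersistentRDDs.size}")
    } finally sc.stop()
  }
}
```

With this shape, repeated runs in the same SparkContext do not leave older cached copies behind.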







answered Apr 2 '16 at 20:46 by Anupam Mahapatra; edited Apr 3 '16 at 11:47 by Alberto Bonsanto























If you want to remove all the cached RDDs, use this:

for ((k, v) <- sc.getPersistentRDDs) {
  v.unpersist()
}
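As a self-contained variant of the loop above, the following sketch persists two made-up RDDs and then clears everything the context tracks. The names `UnpersistAllExample` and `unpersistAll` are hypothetical, and a local master is assumed.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

object UnpersistAllExample {
  /** Unpersists every RDD the context currently tracks as persistent
    * and returns how many were cleared. */
  def unpersistAll(sc: SparkContext): Int = {
    val persisted = sc.getPersistentRDDs   // Map[Int, RDD[_]], keyed by RDD id
    persisted.values.foreach(_.unpersist())
    persisted.size
  }

  def main(args: Array[String]): Unit = {
    // Assumption: local master, two small made-up datasets.
    val conf = new SparkConf().setMaster("local[2]").setAppName("unpersist-all-example")
    val sc = new SparkContext(conf)
    try {
      val a = sc.parallelize(1 to 10).cache()
      val b = sc.parallelize(Seq("x", "y")).persist(StorageLevel.MEMORY_AND_DISK)
      a.count()
      b.count()
      val cleared = unpersistAll(sc)
      println(s"cleared $cleared RDDs, remaining: ${sc.getPersistentRDDs.size}")
    } finally sc.stop()
  }
}
```

Note that this clears every persisted RDD in the context, including ones cached by other parts of the same application.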






answered Jan 2 at 5:01 by Sankar; edited Jan 2 at 6:26





























