SPARK MoveDFS exception



























We are facing a strange issue with Spark. We move a file through Spark from a source to a destination.

Both the source and the destination are Hive tables, and their column structures differ slightly.

The log shows the following:



18/12/28 18:29:40 INFO hiveWriterUtil: HDFS files in place: hdfs://bdpdev/tmp/sess5710288177503367165/mobl_data_hour_summ_nsit_s1ap
18/12/28 18:29:40 INFO hiveWriterUtil: MoveDfs: Target location doesn't exist
18/12/28 18:29:40 ERROR ApplicationMaster: User class threw exception: java.lang.reflect.InvocationTargetException
java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at com.informatica.compiler.InfaSparkMain$.main(InfaSparkMain.scala:108)
    at com.informatica.compiler.InfaSparkMain.main(InfaSparkMain.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:637)
Caused by: org.apache.spark.SparkException: MoveDfs: Unable to move source hdfs://bdpdev/tmp/sess5710288177503367165/mobl_data_hour_summ_nsit_s1ap/dt=2018-10-30/hr=13 to destination hdfs://bdpdev/data/dev1/processed/rwdt/gnrl/shar/NDC/insights/mobl_data_hour_summ_nsit_s1ap/dt=2018-10-30/hr=13
    at com.informatica.hive.hiveWriterUtil$.moveFile(hivebkt.scala:739)
    at com.informatica.hive.hiveWriterUtil$.moveFileInDfs(hivebkt.scala:678)
    at com.informatica.hive.hiveWriterUtil$$anonfun$writeToBucket$2.apply(hivebkt.scala:316)
    at com.informatica.hive.hiveWriterUtil$$anonfun$writeToBucket$2.apply(hivebkt.scala:312)
    at scala.collection.mutable.HashSet.foreach(HashSet.scala:78)
    at com.informatica.hive.hiveWriterUtil$.writeToBucket(hivebkt.scala:312)
    at com.informatica.exec.InfaSpark0$.main(InfaSpark0.scala:61)
    at com.informatica.exec.InfaSpark0.main(InfaSpark0.scala)
    ... 11 more
Caused by: java.io.FileNotFoundException: File does not exist: hdfs://bdpdev/data/dev1/processed/rwdt/gnrl/shar/NDC/insights/mobl_data_hour_summ_nsit_s1ap/dt=2018-10-30
    at org.apache.hadoop.hdfs.DistributedFileSystem$20.doCall(DistributedFileSystem.java:1269)
    at org.apache.hadoop.hdfs.DistributedFileSystem$20.doCall(DistributedFileSystem.java:1261)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1261)
    at com.informatica.hive.hiveWriterUtil$.getFileStatus(hivebkt.scala:766)


The failing step is the move from the source

hdfs://bdpdev/tmp/sess5710288177503367165/mobl_data_hour_summ_nsit_s1ap/dt=2018-10-30/hr=13

to the destination

hdfs://bdpdev/data/dev1/processed/rwdt/gnrl/shar/NDC/insights/mobl_data_hour_summ_nsit_s1ap/dt=2018-10-30/hr=13

The destination table is partitioned on more than one level (dt, then hr, judging by the paths). The error says that the following parent partition path does not exist:
hdfs://bdpdev/data/dev1/processed/rwdt/gnrl/shar/NDC/insights/mobl_data_hour_summ_nsit_s1ap/dt=2018-10-30/

Ideally, I would expect that dt partition directory to be created as part of the move, along with the hr partition beneath it.
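
To make that expectation concrete, here is a minimal sketch of the behaviour I have in mind. It is only a local-filesystem analogue written with java.nio.file, not the Hadoop API and not Informatica's actual code, and the names PartitionMove, moveWithParents and demo are made up for illustration. On HDFS I assume the equivalent would be calling FileSystem.mkdirs on the destination's parent before FileSystem.rename, but I have not confirmed that this is what the writer does.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

class PartitionMove {

    // Hypothetical helper: create the destination's missing parent
    // directories (e.g. the dt=... level) before moving the hr=... directory.
    static void moveWithParents(Path src, Path dst) throws IOException {
        Path parent = dst.getParent();
        if (parent != null) {
            Files.createDirectories(parent); // no-op if the directories already exist
        }
        Files.move(src, dst);
    }

    // Reproduces the failure and the fix inside a temporary directory.
    // Returns true when the plain move fails and the parent-creating move succeeds.
    static boolean demo() {
        try {
            Path root = Files.createTempDirectory("movedfs-demo");
            Path src = Files.createDirectories(root.resolve("tmp/dt=2018-10-30/hr=13"));
            Path dst = root.resolve("data/dt=2018-10-30/hr=13");

            boolean plainMoveFailed = false;
            try {
                // The parent data/dt=2018-10-30 does not exist yet, so this fails.
                Files.move(src, dst);
            } catch (IOException e) {
                plainMoveFailed = true;
            }

            moveWithParents(src, dst);
            return plainMoveFailed && Files.isDirectory(dst) && Files.notExists(src);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("plain move failed, move with parents succeeded: " + demo());
    }
}
```

The point of the sketch is only the ordering: the parent partition directory is created first, then the move happens. Whether the Informatica writer can be configured to do this, or whether the dt partition has to exist beforehand, is exactly what I am unsure about.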



Could someone please help?










      apache-spark hive mapreduce bigdata hadoop2














      asked Jan 1 at 17:22 by Sridar V
      edited Jan 2 at 9:28 by Naveen Nelamali























