MiniDFSCluster: HDFS triple slash schema extension wrong FS





I am using the fs.defaultFS setting in my HDFS configuration. I create a Configuration and then set it explicitly:



import java.net.URI

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.FileSystem

val config = new Configuration()
config.set("fs.defaultFS", "hdfs://localhost:8020")
// filePath is the path string being opened, e.g. "hdfs:///tmp/hdfstest"
val fs = FileSystem.get(new URI(filePath), config)


The code works fine most of the time, but for a filePath with a triple slash I get an error, and only on a few machines:



 Wrong FS: hdfs:/tmp/hdfstest, expected: hdfs://localhost:8020


The single slash appears only in the exception message. Everywhere else in the system I see the triple slash: hdfs:///tmp/hdfstest.

Also, for paths like /tmp/hdfstest, with no scheme and no triple slash, defaultFS works perfectly.
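
A possible explanation for the single slash, offered as a sketch rather than a confirmed diagnosis: java.net.URI parses hdfs:///tmp/hdfstest as scheme hdfs, no authority, and path /tmp/hdfstest, and a URI rebuilt from those components prints as hdfs:/tmp/hdfstest. Whether Hadoop rebuilds the URI this way before formatting the message is an assumption here. The object TripleSlashUri and its rebuild helper are hypothetical names and use only the JDK.

```scala
import java.net.URI

object TripleSlashUri {
  // Hypothetical helper: rebuild a URI from its parsed components,
  // dropping query and fragment.
  def rebuild(uri: URI): URI =
    new URI(uri.getScheme, uri.getAuthority, uri.getPath, null, null)

  def main(args: Array[String]): Unit = {
    val parsed = new URI("hdfs:///tmp/hdfstest")
    println(s"scheme    = ${parsed.getScheme}")
    // The authority is null: nothing sits between "//" and the next "/".
    println(s"authority = ${parsed.getAuthority}")
    println(s"path      = ${parsed.getPath}")
    // With no authority to emit, the rebuilt form has a single slash.
    println(s"rebuilt   = ${rebuild(parsed)}")
  }
}
```

If that is what happens, hdfs:/tmp/hdfstest and hdfs:///tmp/hdfstest describe the same thing: an hdfs path with no host or port.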



I would appreciate any advice. Thank you in advance!

UPD: The exception was seen in tests run on MiniDFSCluster. During the tests I used the same MiniDFSCluster with different configurations.







scala hadoop hdfs














asked Jan 3 at 9:12 by Valentina, edited Jan 7 at 12:01













  • You should not use URI scheme if you are passing it in config

    – Sachin Janani
    Jan 3 at 12:21

  • @SachinJanani oh, I missed the NOT in your comment. Should I just specify localhost:8020? As far as I can see, it didn't make a difference for my test case. Also, I am very confused about why it worked for cases without the triple slash. And that's the way it is written in core-site.xml.

    – Valentina
    Jan 3 at 12:40



































2 Answers
It turned out that this was not an HDFS issue but a MiniDFSCluster test problem.
In my test suite, I was creating a test cluster and then checking different defaultFS scenarios on it.

MiniDFSCluster has some issues caused by sharing of config, and certain use cases can produce unexpected results, with unit tests falsely failing or passing.

For more info, there is a ticket in Apache.

















answered Jan 7 at 12:07 by Valentina

























If you want to use fs.defaultFS, you should not specify any scheme or authority, so your paths should look like /path/to/file. A URI with a scheme, like hdfs://localhost:port/path/to/file, will ignore the default FS. You shouldn't ever use the HDFS scheme without a host/port, as in hdfs:///. Instead, you should either rely on the default FS or explicitly specify a host/port combination.
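
To illustrate the distinction, here is a minimal sketch using only java.net.URI. It is not Hadoop's resolution logic: qualify is a hypothetical helper, and otherhost:9000 is a made-up address. A path without a scheme picks up the scheme and authority of the default FS, while a URI that already has a scheme is returned unchanged.

```scala
import java.net.URI

object QualifySketch {
  // Hypothetical helper, not a Hadoop API: a scheme-less path borrows the
  // scheme and authority of the default FS; anything with a scheme is
  // returned as given, so the default FS plays no part.
  def qualify(path: String, defaultFs: URI): URI = {
    val uri = new URI(path)
    if (uri.getScheme == null)
      new URI(defaultFs.getScheme, defaultFs.getAuthority, uri.getPath, null, null)
    else
      uri
  }

  def main(args: Array[String]): Unit = {
    val defaultFs = new URI("hdfs://localhost:8020")
    // Relies on the default FS.
    println(qualify("/tmp/hdfstest", defaultFs))
    // Explicit host/port (made-up address): the default FS is ignored.
    println(qualify("hdfs://otherhost:9000/tmp/hdfstest", defaultFs))
    // Scheme but no host/port: the result carries no authority at all.
    println(qualify("hdfs:///tmp/hdfstest", defaultFs).getAuthority)
  }
}
```

In this sketch the hdfs:/// form ends up with a scheme but no authority, which is the ambiguous case the advice above steers away from.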

















answered Jan 4 at 20:05 by krog













  • Do you mean that hdfs:/// shouldn't be used at all? Or in which cases? I am very confused, as I have seen the triple slash used (and working) a lot in EMR, Spark and so on.

    – Valentina
    Jan 5 at 18:51




















